Vaccine

ABSTRACT

This invention relates to novel HIV polypeptide and polynucleotide fusions of Gag, Pol and Nef which are useful in immunogenic compositions and vaccines. The invention relates in particular to a polypeptide which comprises Nef or an immunogenic fragment thereof, and p17 Gag and/or p24 Gag or immunogenic fragments thereof, wherein when both p17 and p24 Gag are present there is at least one HIV antigen or immunogenic fragment between them. The polypeptide may also comprise Pol or RT or an immunogenic fragment thereof.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. Ser. No. 11/573,128, filed 2 Feb. 2007, which is a §371 application of PCT/EP2005/008434, filed 3 Aug. 2005, the disclosure of which is incorporated herein by reference. This application also claims benefit of the earlier filing date of Great Britain Patent Application No. 0417494.2, filed 5 Aug. 2004.

FIELD OF THE INVENTION

The present invention relates to novel HIV polypeptide constructs, to their use in medicine, to pharmaceutical compositions comprising them and to methods for their manufacture. The invention also relates to polynucleotides encoding the polypeptides. In particular, the invention relates to fusion proteins comprising HIV-1 Nef and HIV-1 Gag or fragments thereof, and to polynucleotides encoding them. More particularly, the invention relates to fusion proteins comprising HIV-1 Nef, HIV-1 Pol and HIV-1 Gag proteins or fragments thereof and to polynucleotides encoding them.

HIV-1 is the primary cause of the acquired immune deficiency syndrome (AIDS) which is regarded as one of the world's major health problems. There is a need for a vaccine for the prevention and/or treatment of HIV infection.

BACKGROUND TO THE INVENTION

HIV-1 is an RNA virus of the family Retroviridae. The HIV genome encodes at least nine proteins which are divided into three classes: the major structural proteins Gag, Pol and Env, the regulatory proteins Tat and Rev, and the accessory proteins Vpu, Vpr, Vif and Nef. The HIV genome exhibits the 5′LTR-gag-pol-env-LTR3′ organization of all retroviruses.

The HIV envelope glycoprotein gp120 is the viral protein that is used for attachment to the host cell. This attachment is mediated by binding to two surface molecules of helper T cells and macrophages, known as CD4 and one of the two chemokine receptors CCR-5 or CXCR-4. The gp120 protein is first expressed as a larger precursor molecule (gp160), which is then cleaved post-translationally to yield gp120 and gp41. The gp120 protein is retained on the surface of the virion by linkage to the gp41 molecule, which is inserted into the viral membrane.

The gp120 protein is the principal target of neutralizing antibodies, but unfortunately the most immunogenic regions of the protein (V3 loop) are also the most variable parts of the protein. Therefore, the use of gp120 (or its precursor gp160) as a vaccine antigen to elicit neutralizing antibodies is thought to be of limited use for a broadly protective vaccine. The gp120 protein does also contain epitopes that are recognized by cytotoxic T lymphocytes (CTL). These effector cells are able to eliminate virus-infected cells, and therefore constitute a second major antiviral immune mechanism. In contrast to the target regions of neutralizing antibodies, some CTL epitopes appear to be relatively conserved among different HIV strains. For this reason gp120 and gp160 may be useful antigenic components in vaccines that aim at eliciting cell-mediated immune responses (particularly CTL).

Non-envelope proteins of HIV-1 include for example internal structural proteins such as the products of the gag and pol genes and other non-structural proteins such as Rev, Nef, Vif and Tat (Green et al., New England J. Med, 324, 5, 308 et seq (1991) and Bryant et al. (Ed. Pizzo), Pediatr. Infect. Dis. J., 11, 5, 390 et seq (1992)).

HIV Nef is an early protein, that is, it is expressed early in infection and in the absence of structural protein.

The Nef gene encodes an early accessory HIV protein which has been shown to possess several activities. For example, the Nef protein is known to cause the down regulation of CD4, the HIV receptor, and MHC class I molecules from the cell surface, although the biological importance of these functions is debated. Additionally Nef interacts with the signal pathway of T cells and induces an active state, which in turn may promote more efficient gene expression. Some HIV isolates have mutations in this region, which cause them not to encode functional protein and are severely compromised in their replication and pathogenesis in vivo.

The Gag gene is translated as a precursor polyprotein that is cleaved by protease to yield products that include the matrix protein (p17), the capsid (p24), the nucleocapsid (p9), p6 and two spacer peptides, p2 and p1.

The Gag gene gives rise to the 55-kilodalton (kD) Gag precursor protein, also called p55, which is expressed from the unspliced viral mRNA. During translation, the N terminus of p55 is myristoylated, triggering its association with the cytoplasmic aspect of cell membranes. The membrane-associated Gag polyprotein recruits two copies of the viral genomic RNA along with other viral and cellular proteins that triggers the budding of the viral particle from the surface of an infected cell. After budding, p55 is cleaved by the virally encoded protease (a product of the pol gene) during the process of viral maturation into four smaller proteins designated MA (matrix [p17]), CA (capsid [p24]), NC (nucleocapsid [p9]), and p6.

In addition to the 3 major Gag proteins, all Gag precursors contain several other regions, which are cleaved out and remain in the virion as peptides of various sizes. These proteins have different roles e.g. the p2 protein has a proposed role in regulating activity of the protease and contributes to the correct timing of proteolytic processing.

The p17 (MA) polypeptide is derived from the N-terminal, myristoylated end of p55. Most MA molecules remain attached to the inner surface of the virion lipid bilayer, stabilizing the particle. A subset of MA is recruited inside the deeper layers of the virion where it becomes part of the complex which escorts the viral DNA to the nucleus. These MA molecules facilitate the nuclear transport of the viral genome because a karyophilic signal on MA is recognized by the cellular nuclear import machinery. This phenomenon allows HIV to infect non-dividing cells, an unusual property for a retrovirus.

The p24 (CA) protein forms the conical core of viral particles. Cyclophilin A has been demonstrated to interact with the p24 region of p55 leading to its incorporation into HIV particles. The interaction between Gag and cyclophilin A is essential because the disruption of this interaction by cyclosporin A inhibits viral replication.

The NC region of Gag is responsible for specifically recognizing the so-called packaging signal of HIV. The packaging signal consists of four stem loop structures located near the 5′ end of the viral RNA, and is sufficient to mediate the incorporation of a heterologous RNA into HIV-1 virions. NC binds to the packaging signal through interactions mediated by two zinc-finger motifs. NC also facilitates reverse transcription.

The p6 polypeptide region mediates interactions between p55 Gag and the accessory protein Vpr, leading to the incorporation of Vpr into assembling virions. The p6 region also contains a so-called late domain which is required for the efficient release of budding virions from an infected cell.

The Pol gene encodes two proteins containing the two activities needed by the virus in early infection, the RT and the integrase protein needed for integration of viral DNA into cell DNA. The primary product of Pol is cleaved by the virion protease to yield the amino terminal RT peptide which contains activities necessary for DNA synthesis (RNA- and DNA-dependent DNA polymerase activity as well as an RNase H function) and carboxy terminal integrase protein. HIV RT is a heterodimer of full-length RT (p66) and a cleavage product (p51) lacking the carboxy terminal RNase H domain.

RT is one of the most highly conserved proteins encoded by the retroviral genome. Two major activities of RT are the DNA Pol and Ribonuclease H. The DNA Pol activity of RT uses RNA and DNA as templates interchangeably and like all DNA polymerases known is unable to initiate DNA synthesis de novo, but requires a pre-existing molecule to serve as a primer (RNA).

The RNase H activity inherent in all RT proteins plays the essential role early in replication of removing the RNA genome as DNA synthesis proceeds. It selectively degrades the RNA from all RNA-DNA hybrid molecules. Structurally the polymerase and RNase H occupy separate, non-overlapping domains with the Pol covering the amino two thirds of the Pol.

The p66 catalytic subunit is folded into 5 distinct subdomains. The amino terminal 23 of these have the portion with RT activity. Carboxy terminal to these is the RNase H Domain.

WO 03/025003 describes DNA constructs encoding HIV-1 p17/24 Gag, Nef and RT, wherein the DNA sequences may be codon optimized to resemble highly expressed human genes. These constructs are useful in DNA vaccines.

Fusion proteins containing multiple HIV antigens have been suggested as vaccine candidates for HIV, for example the Nef-Tat fusion as described in WO 99/16884. However, fusion proteins are not straightforward to produce; there can be difficulties in expressing them because they do not correspond to native proteins. There can be difficulties at the transcription level, or further downstream. Also, they may not be straightforward to formulate into a pharmaceutically acceptable composition. Notably, the majority of approaches to HIV vaccines that involve multiple antigens fused together are DNA or live vector approaches rather than polypeptide fusion proteins.

SUMMARY OF THE INVENTION

The present invention provides novel constructs for use in vaccines for the prophylaxis and treatment of HIV infections and AIDS.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A and B are images of Coomassie stained gels (upper panels) and western blots (lower panels) of F4 p24-RT-Nef-P17.

FIG. 2 is images of a Coomassie stained gel (upper panel) and western blot (lower panel) of codon-optimized F4.

FIG. 3 is an alignment of FT proteins.

FIG. 4 is images of a Coomassie stained gel (left panel) and western blot (right panel) of P51 RT (codon optimized).

FIG. 5 is images of a Coomassie stained gel (left panel) and western blot (right panel) illustrating solubility of RT/P51 and RT/p66.

FIG. 6 is images of a Coomassie stained gel (left panel) and western blot (right panel) showing expression of Nef-p17 and p17-Nef fusions.

FIG. 7 is images of a Coomassie stained gel (left panel) and western blot (right panel) illustrating solubility of Nef-p17, p17-Nef and Nef proteins.

FIG. 8 is images of a Coomassie stained gel (left panel) and western blots (three right panels) showing expression of F4 fusion protein.

FIG. 9 is images of a Coomassie stained gel (upper panel) and western blots (lower panels) showing expression of F3 fusion proteins.

FIG. 10 is images of a Coomassie stained gel (left panel) and western blot (right panel) showing expression of F3* fusion proteins.

FIG. 11 is images of a Coomassie stained gel (upper panel) and western blots (lower panels) showing expression of F4(p51).

FIG. 12 is images of a Coomassie stained gel (left panel) and western blots (two right panels) showing expression of F4(p51) and F4(p51)* fusion proteins.

FIGS. 13A and B are images of a Coomassie stained gel (A) and western blot (B) showing purification of F4 fusion protein.

FIGS. 14A and B are images of a Coomassie stained gel (A) and western blot (B) showing purification of F4(p51)*.

FIGS. 15A and B are images of a Coomassie stained gel (A) and western blot (B) showing purification of F4 fusion protein.

FIGS. 16A and B are images of a Coomassie stained gel (A) and western blot (B) showing a comparison of purity between F4, F4* and F4(p51)* fusion proteins.

FIG. 17 is an image of a Coomassie stained gel following purification of F4co and carboxyamidated F4co.

FIG. 18 is an image of a Coomassie stained gel following purification of F4, F4co and F4coca.

FIG. 19 is a series of bar graphs showing immunogenicity in mice.

FIG. 20 is two bar graphs showing immunogenicity in mice.

DETAILED DESCRIPTION

In one aspect the invention provides a polypeptide which comprises Nef or an immunogenic fragment or derivative thereof, and p17 Gag and/or p24 Gag or immunogenic fragments or derivatives thereof, wherein when both p17 and p24 Gag are present there is at least one HIV antigen or immunogenic fragment between them.

In the constructs and compositions according to the invention as described herein, the Nef is preferably a full length Nef.

In the constructs according to the invention the p17 Gag and p24 Gag are preferably full length p17 and p24 respectively.

In one embodiment the polypeptide comprises both p17 and p24 Gag or immunogenic fragments thereof. In such a construct the p24 Gag component and p17 Gag component are separated by at least one further HIV antigen or immunogenic fragment, such as Nef and/or RT or immunogenic fragments or derivatives thereof.

Alternatively p17 or p24 Gag may be provided separately. Thus the invention also provides a composition comprising (i) a polypeptide which comprises Nef or an immunogenic fragment or derivative thereof and p17 Gag or an immunogenic fragment or derivative thereof, and (ii) p24 Gag or an immunogenic fragment or derivative thereof; or (i) a polypeptide which comprises Nef or an immunogenic fragment or derivative thereof and p24 Gag or an immunogenic fragment or derivative thereof, and (ii) p17 Gag or an immunogenic fragment or derivative thereof.

In another embodiment the polypeptide construct according to the invention further comprises Pol or a derivative of Pol such as RT or an immunogenic fragment or derivative thereof. Particular fragments of RT that are suitable for use in the invention are fragments in which the RT is truncated at the C terminus, preferably such that they lack the carboxy terminal RNase H domain. One such fragment lacking the carboxy terminal RNase H domain is the p51 fragment described herein.

Preferably the RT or immunogenic fragment in the fusion proteins described herein is p66 RT or p51 RT.

The RT component of the fusion protein or composition according to the invention optionally comprises a mutation at position 592, or equivalent mutation in strains other than HXB2, such that the methionine is removed by mutation to another residue e.g. lysine. The purpose of this mutation is to remove a site which serves as an internal initiation site in prokaryotic expression systems.
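The substitution described above can be illustrated with a minimal sketch. It assumes a protein is held as a one-letter amino acid string with 1-based residue numbering; the function names and the toy sequence are hypothetical and are not taken from the sequences disclosed herein.

```python
def substitute_residue(sequence: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position, checking the original residue first."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


def internal_methionines(sequence: str) -> list:
    """Return 1-based positions of methionines other than the N-terminal residue."""
    return [i + 1 for i, residue in enumerate(sequence) if residue == "M" and i > 0]


# Hypothetical toy sequence, NOT a real RT sequence; position 5 stands in for position 592.
toy_sequence = "MAGSMKLV"
mutated = substitute_residue(toy_sequence, 5, expected="M", replacement="K")
print(internal_methionines(toy_sequence))  # internal Met positions before the mutation
print(internal_methionines(mutated))       # internal Met positions after the mutation
```

Checking the expected residue before substituting guards against applying the numbering of one strain to the sequence of another, which is the situation the phrase "equivalent mutation in strains other than HXB2" anticipates.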

The RT component also, or alternatively, comprises a mutation to remove the enzyme activity (reverse transcriptase). Thus K231 may be present instead of W.

In fusion proteins according to the invention which comprise p24 and RT, it may be preferable that the p24 precedes the RT in the construct because when the antigens are expressed alone in E. coli better expression of p24 than of RT is observed.

Preferred constructs according to the invention include the following:

1. p24-RT-Nef-p17
2. p24-RT*-Nef-p17
3. p24-p51RT-Nef-p17
4. p24-p51RT*-Nef-p17
5. p17-p51RT-Nef
6. p17-p51RT*-Nef
7. Nef-p17
8. Nef-p17 with linker
9. p17-Nef
10. p17-Nef with linker

* represents RT methionine₅₉₂ mutation to lysine

The linker included in the constructs listed above may be any short amino acid sequence for decreasing potential interactions between the two fusion partners that it links together. The linker may be for example from 4-10 amino acids in length. For example, it may be a 6 amino acid sequence such as the GSGGGP sequence (SEQ ID NO:20) described herein in the examples.
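The ordering rule for the Gag components and the example linker length range can be expressed as a short check. This is a sketch only: it assumes constructs are written as hyphen-separated component names in N- to C-terminal order, as in the list above, and the function names are hypothetical.

```python
from typing import List


def parse_construct(name: str) -> List[str]:
    """Split a name such as 'p24-RT*-Nef-p17' into ordered components, dropping '*'."""
    return [part.rstrip("*") for part in name.split("-")]


def gag_components_separated(name: str) -> bool:
    """True unless p17 and p24 are both present and directly adjacent."""
    components = parse_construct(name)
    if "p17" in components and "p24" in components:
        return abs(components.index("p17") - components.index("p24")) > 1
    return True


def linker_length_ok(linker: str, minimum: int = 4, maximum: int = 10) -> bool:
    """Check a linker against the 4-10 amino acid range given as an example."""
    return minimum <= len(linker) <= maximum


PREFERRED_CONSTRUCTS = [
    "p24-RT-Nef-p17", "p24-RT*-Nef-p17", "p24-p51RT-Nef-p17", "p24-p51RT*-Nef-p17",
    "p17-p51RT-Nef", "p17-p51RT*-Nef", "Nef-p17", "p17-Nef",
]

for construct in PREFERRED_CONSTRUCTS:
    print(construct, gag_components_separated(construct))
print("GSGGGP", linker_length_ok("GSGGGP"))
```

The check is purely about component order; it says nothing about expression, solubility or immunogenicity of a given construct.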

In another aspect the present invention provides a fusion protein of HIV antigens comprising at least four HIV antigens or immunogenic fragments, wherein the four antigens or fragments are or are derived from Nef, Pol and Gag. Preferably Gag is present as two separate components which are separated by at least one other antigen in the fusion. Preferably the Nef is full length Nef. Preferably the Pol is p66 or p51RT. Preferably the Gag is p17 and p24 Gag. Other preferred features and properties of the antigen components of the fusion in this aspect of the invention are as described herein.

Preferred embodiments of this aspect of the invention are the four component fusions as already listed above:

1. p24-RT-Nef-p17
2. p24-RT*-Nef-p17
3. p24-p51RT-Nef-p17
4. p24-p51RT*-Nef-p17

The term “derived from” or “derivative” in relation to the HIV antigens included in the invention means that the antigens may have been altered in a limited way compared to their native counterparts. This includes point mutations which may change the properties of the protein for example by improving expression in prokaryotic systems or removing undesirable activity including undesirable enzyme activity. The point mutations described herein for RT are designed to achieve these things. However, the antigens must remain sufficiently similar to the native antigens such that they retain the antigenic properties desirable in a vaccine and thus they remain capable of raising an immune response against the native antigen. Whether or not a particular derivative raises such an immune response may be measured by a suitable immunological assay such as an ELISA (for antibody responses) or flow cytometry using suitable staining for cellular markers and cytokines (for cellular responses).

The polypeptide constructs of HIV antigens according to the invention are capable of being expressed in in vitro systems including prokaryotic systems such as E. coli. Advantageously they can be purified by conventional purification methods.

The fusions described herein are preferably soluble when expressed in a selected expression system, that is they are present in a substantial amount in the supernatant of a crude extract from the expression system. The presence of the fusion protein in the crude extract can be measured by conventional means such as running on an SDS gel, coomassie staining and checking the appropriate band by densitometric measurement. Fusion proteins according to the invention are preferably at least 50% soluble, more preferably at least 70% soluble, most preferably 90% soluble or greater as measured by the techniques described herein in the Examples. Techniques to improve solubility of recombinantly expressed proteins are known, for example in prokaryotic expression systems solubility is improved by lowering the temperature at which gene expression is induced.
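As an illustration of how a percentage solubility might be derived from densitometry, the following sketch assumes that the band of interest is quantified in both the supernatant (soluble) and pellet (insoluble) fractions of the same extract, and that solubility is the supernatant share of the total signal. That formula and the function names are assumptions made for illustration; the measurement actually used is the one set out in the Examples.

```python
def percent_soluble(supernatant_signal: float, pellet_signal: float) -> float:
    """Supernatant band signal as a percentage of total (supernatant + pellet) signal."""
    if supernatant_signal < 0 or pellet_signal < 0:
        raise ValueError("band signals must be non-negative")
    total = supernatant_signal + pellet_signal
    if total == 0:
        raise ValueError("no band signal measured in either fraction")
    return 100.0 * supernatant_signal / total


def solubility_tier(percent: float) -> str:
    """Map a percentage onto the preference tiers stated in the text."""
    if percent >= 90:
        return "most preferred (90% or greater)"
    if percent >= 70:
        return "more preferred (at least 70%)"
    if percent >= 50:
        return "preferred (at least 50%)"
    return "below the preferred range"


# Hypothetical densitometry readings in arbitrary units.
value = percent_soluble(supernatant_signal=75.0, pellet_signal=25.0)
print(value, solubility_tier(value))
```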

The fusion proteins described herein can be purified. In particular they can be purified while remaining soluble or significantly soluble.

Immunogenic fragments as described herein will contain at least one epitope of the antigen and display HIV antigenicity and are capable of raising an immune response when presented in a suitable construct, such as for example when fused to other HIV antigens or presented on a carrier, the immune response being directed against the native antigen. Typically the immunogenic fragments contain at least 20, preferably 50, more preferably 100 contiguous amino acids from the HIV antigen.

The invention provides in a further aspect polynucleotides encoding the polypeptides according to the invention.
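The length criterion for fragments can be sketched as a simple sequence check. The function name and the toy antigen are hypothetical, and the check covers only contiguity and length: whether a fragment contains an epitope or raises an immune response has to be established experimentally.

```python
def fragment_tier(fragment: str, antigen: str) -> str:
    """Classify a fragment by the contiguous-length thresholds stated in the text."""
    if not fragment or fragment not in antigen:
        return "not a contiguous fragment of the antigen"
    length = len(fragment)
    if length >= 100:
        return "more preferred (at least 100 residues)"
    if length >= 50:
        return "preferred (at least 50 residues)"
    if length >= 20:
        return "typical (at least 20 residues)"
    return "shorter than the typical minimum of 20 residues"


# Hypothetical 120-residue toy antigen, NOT a real HIV sequence.
toy_antigen = "ACDEFGHIKLMNPQRSTVWY" * 6
print(fragment_tier(toy_antigen[10:35], toy_antigen))   # 25 contiguous residues
print(fragment_tier(toy_antigen[:100], toy_antigen))    # 100 contiguous residues
```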

Polynucleotides according to the invention may be used as polynucleotide vaccines. The polynucleotides may be present within any of a variety of delivery systems known to those of ordinary skill in the art, including nucleic acid expression systems such as plasmid DNA, bacterial and viral expression systems. Numerous gene delivery techniques are well known in the art, such as those described by Rolland, Crit. Rev. Therap. Drug Carrier Systems 15:143-198, 1998 and references cited therein. Appropriate nucleic acid expression systems contain the necessary DNA sequences for expression in the patient (such as a suitable promoter and terminating signal). When the expression system is a recombinant live microorganism, such as a virus or bacterium, the gene of interest can be inserted into the genome of the live recombinant virus or bacterium. Inoculation and in vivo infection with this live vector will lead to in vivo expression of the antigen and induction of immune responses. Viruses and bacteria used for this purpose are for instance: poxviruses (e.g. vaccinia, fowlpox, canarypox, modified poxviruses e.g. Modified Vaccinia Ankara (MVA)), alphaviruses (Sindbis virus, Semliki Forest Virus, Venezuelan Equine Encephalitis Virus), flaviviruses (yellow fever virus, Dengue virus, Japanese encephalitis virus), adenoviruses, adeno-associated virus, picornaviruses (poliovirus, rhinovirus), herpesviruses (varicella zoster virus, etc), morbilliviruses (e.g. measles), Listeria, Salmonella, Shigella, Neisseria, BCG. These viruses and bacteria can be virulent, or attenuated in various ways in order to obtain live vaccines. Such live vaccines also form part of the invention.

A preferred measles vector for use as a live vector according to the invention is the Schwarz strain or a strain derived therefrom.

A preferred adenovirus for use as a live vector is a low sero-prevalent human adenovirus such as Ad5 or Ad35 or a non-human originating adenovirus such as a non-human primate adenovirus such as a simian adenovirus. Such low sero-prevalent human or similar adenoviruses will have less than 60%, typically less than 50%, sero-prevalence in the population. Preferably the vectors are replication defective. Typically these viruses contain an E1 deletion and can be grown on cell lines that are transformed with an E1 gene. Preferred simian adenoviruses are viruses isolated from chimpanzee. In particular C68 (also known as Pan 9) (See U.S. Pat. No. 6,083,716) and Pan 5, 6 and Pan 7 (WO 03/046124) are preferred for use in the present invention. These vectors can be manipulated to insert a heterologous polynucleotide according to the invention such that the polypeptides according to the invention may be expressed. The use, formulation and manufacture of such recombinant adenoviral vectors is described in detail in WO 03/046142.

Thus, the Nef, p17 and p24 Gag and RT of a preferred vaccine according to the invention may be provided in the form of a polynucleotide encoding the desired polypeptide.

Polynucleotides according to the invention may be used to express the encoded polypeptides in a selected expression system. At least one of the HIV antigens, for example the RT, may be encoded by a codon optimized sequence in the polynucleotide, that is to say the sequence has been optimized for expression in a selected recombinant expression system such as E. coli.

In another aspect the invention provides a p51 RT polypeptide or derivative thereof or a polynucleotide encoding it, preferably codon optimized for expression in a suitable expression system, particularly a prokaryotic system such as E. coli.

The p51 RT polypeptide or polynucleotide may be used alone, or in combination with a polypeptide or polynucleotide construct according to the invention. Thus in a further aspect the invention provides a composition comprising (i) a polypeptide which comprises Nef or a fragment containing a Nef epitope and p17 Gag and/or p24 Gag, wherein when both p17 and p24 Gag are present there is at least one HIV antigen or immunogenic fragment between them and (ii) a p51 RT polypeptide. The invention further provides polynucleotides encoding these.

According to this embodiment (i) may be selected from for example:

1. Nef-p17
2. Nef-p17 with linker
3. p17-Nef
4. p17-Nef with linker

Preferably Nef is full length Nef. Preferably p17 is full length p17.

The polypeptides and polynucleotides according to the invention may be combined with other antigens or polynucleotides encoding other antigens. In particular, this may include HIV env proteins or fragments or derivatives thereof. Preferred forms of env are gp120, gp140 and gp160. The env may be for example the envelope protein described in WO 00/07631 from an HIV-1 clade B envelope clone known as R2, or a fragment or derivative thereof. Thus the invention further provides a composition comprising any of the polypeptides or polypeptide compositions according to the invention, together with an HIV env protein or fragment or derivative thereof. Similarly the invention provides a composition comprising a polynucleotide or polynucleotides encoding a polypeptide or polypeptides according to the invention and a polynucleotide encoding an HIV env protein or fragment or derivative thereof.

The invention further provides methods of preparing the polypeptides described herein which method comprises expressing a polynucleotide encoding the polypeptide in a suitable expression system, particularly a prokaryotic system such as E. coli and recovering the expressed polypeptide. Preferably expression is induced at a low temperature, that is a temperature below 37° C., to promote the solubility of the polypeptide.

The invention further provides a process for purifying a polypeptide as described herein, which process comprises:

i) providing a composition comprising the unpurified polypeptide;
ii) subjecting the composition to at least two chromatographic steps;
iii) optionally carboxyamidating the polypeptide;
iv) performing a buffer exchange step to provide the protein in a suitable buffer for a pharmaceutical formulation.

The carboxyamidation may be performed between the two chromatographic steps. The carboxyamidation step may be performed using iodoacetamide.

In one example, the process according to the invention uses no more than two chromatographic steps.

The invention further provides pharmaceutical compositions and immunogenic compositions and vaccines comprising the polypeptides and polynucleotides according to the invention, in combination with a pharmaceutically acceptable adjuvant or carrier.

Vaccines according to the invention may be used for prophylactic or therapeutic immunization against HIV.

The invention further provides the use of the polypeptides and polypeptide compositions and the polynucleotides and polynucleotide compositions as described herein, in the manufacture of a vaccine for prophylactic or therapeutic immunization against HIV.

The vaccine of the present invention will contain an immunoprotective or immunotherapeutic quantity of the polypeptide and/or polynucleotide antigens and may be prepared by conventional techniques.

Vaccine preparation is generally described in New Trends and Developments in Vaccines, edited by Voller et al., University Park Press, Baltimore, Md., U.S.A. 1978. Encapsulation within liposomes is described, for example, by Fullerton, U.S. Pat. No. 4,235,877. Conjugation of proteins to macromolecules is disclosed, for example, by Likhite, U.S. Pat. No. 4,372,945 and by Armor et al., U.S. Pat. No. 4,474,757.

The amount of protein in the vaccine dose is selected as an amount which induces an immunoprotective response without significant, adverse side effects in typical vaccinees. Such amount will vary depending upon which specific immunogen is employed and the vaccination regimen that is selected. Generally, it is expected that each dose will comprise 1-1000 μg of each protein, preferably 2-200 μg, most preferably 4-40 μg of the polypeptide fusion. An optimal amount for a particular vaccine can be ascertained by standard studies involving observation of antibody titres and other immune responses in subjects. Following an initial vaccination, subjects may receive a boost in about 4 weeks, and a subsequent second booster immunisation.

The proteins of the present invention are preferably adjuvanted in the vaccine formulation of the invention. Adjuvants are described in general in Vaccine Design: The Subunit and Adjuvant Approach, edited by Powell and Newman, Plenum Press, New York, 1995.

Suitable adjuvants include an aluminium salt such as aluminium hydroxide or aluminium phosphate, but may also be a salt of calcium, iron or zinc, or may be an insoluble suspension of acylated tyrosine, or acylated sugars, cationically or anionically derivatised polysaccharides, or polyphosphazenes.

In the formulation of the invention it is preferred that the adjuvant composition induces a preferential Th1 response. However it will be understood that other responses, including other humoral responses, are not excluded.

An immune response is generated to an antigen through the interaction of the antigen with the cells of the immune system. The resultant immune response may be broadly distinguished into two extreme categories, being humoral or cell mediated immune responses (traditionally characterised by antibody and cellular effector mechanisms of protection respectively). These categories of response have been termed Th1-type responses (cell-mediated response), and Th2-type immune responses (humoral response).

Extreme Th1-type immune responses may be characterised by the generation of antigen specific, haplotype restricted cytotoxic T lymphocytes, and natural killer cell responses. In mice Th1-type responses are often characterised by the generation of antibodies of the IgG2a subtype, whilst in the human these correspond to IgG1 type antibodies. Th2-type immune responses are characterised by the generation of a broad range of immunoglobulin isotypes including in mice IgG1, IgA, and IgM.

It can be considered that the driving force behind the development of these two types of immune responses is cytokines, a number of identified protein messengers which serve to help the cells of the immune system and steer the eventual immune response to either a Th1 or Th2 response. Thus high levels of Th1-type cytokines tend to favour the induction of cell mediated immune responses to the given antigen, whilst high levels of Th2-type cytokines tend to favour the induction of humoral immune responses to the antigen.

It is important to remember that the distinction of Th1 and Th2-type immune responses is not absolute. In reality an individual will support an immune response which is described as being predominantly Th1 or predominantly Th2. However, it is often convenient to consider the families of cytokines in terms of that described in murine CD4 +ve T cell clones by Mosmann and Coffman (Mosmann, T. R. and Coffman, R. L. (1989) TH1 and TH2 cells: different patterns of lymphokine secretion lead to different functional properties. Annual Review of Immunology, 7, p145-173). Traditionally, Th1-type responses are associated with the production of the IFN-γ and IL-2 cytokines by T-lymphocytes. Other cytokines often directly associated with the induction of Th1-type immune responses are not produced by T-cells, such as IL-12. In contrast, Th2-type responses are associated with the secretion of IL-4, IL-5, IL-6, IL-10 and tumour necrosis factor-β (TNF-β).

It is known that certain vaccine adjuvants are particularly suited to the stimulation of either Th1 or Th2-type cytokine responses. Traditionally the best indicators of the Th1:Th2 balance of the immune response after a vaccination or infection include direct measurement of the production of Th1 or Th2 cytokines by T lymphocytes in vitro after restimulation with antigen, and/or the measurement of the IgG1:IgG2a ratio of antigen specific antibody responses.

Thus, a Th1-type adjuvant is one which stimulates isolated T-cell populations to produce high levels of Th1-type cytokines when re-stimulated with antigen in vitro, and induces antigen specific immunoglobulin responses associated with Th1-type isotype.

Preferred Th1-type immunostimulants which may be formulated to produce adjuvants suitable for use in the present invention include and are not restricted to the following.

Monophosphoryl lipid A, in particular 3-de-O-acylated monophosphoryl lipid A (3D-MPL), is a preferred Th1-type immunostimulant for use in the invention. 3D-MPL is a well known adjuvant manufactured by Ribi Immunochem, Montana. Chemically it is often supplied as a mixture of 3-de-O-acylated monophosphoryl lipid A with either 4, 5, or 6 acylated chains. It can be purified and prepared by the methods taught in GB 2122204B, which reference also discloses the preparation of diphosphoryl lipid A, and 3-O-deacylated variants thereof. Other purified and synthetic lipopolysaccharides have been described (U.S. Pat. No. 6,005,099 and EP 0 729 473 B1; Hilgers et al., 1986, Int. Arch. Allergy. Immunol., 79(4):392-6; Hilgers et al., 1987, Immunology, 60(1):141-6; and EP 0 549 074 B1). A preferred form of 3D-MPL is in the form of a particulate formulation having a small particle size less than 0.2 μm in diameter, and its method of manufacture is disclosed in EP 0 689 454.

Saponins are also preferred Th1 immunostimulants in accordance with theinvention. Saponins are well known adjuvants and are taught in:Lacaille-Dubois, M and Wagner H. (1996. A review of the biological andpharmacological activities of saponins Phytomedicine vol 2 pp 363-386).For example, Quil A (derived from the bark of the South American treeQuillaja Saponaria Molina), and fractions thereof, are described in U.S.Pat. No. 5,057,540 and “Saponins as vaccine adjuvants”, Kensil, C. R.,Crit. Rev Ther Drug Carrier Syst, 1996, 12 (1-2):1-55; and EP 0 362 279B1. The haemolytic saponins QS21 and QS17 (HPLC purified fractions ofQuil A) have been described as potent systemic adjuvants, and the methodof their production is disclosed in U.S. Pat. No. 5,057,540 and EP 0 362279 B1. Also described in these references is the use of QS7 (anon-haemolytic fraction of Quil-A) which acts as a potent adjuvant forsystemic vaccines. Use of QS21 is further described in Kensil et al.(1991. J. Immunology vol 146, 431-437). Combinations of QS21 andpolysorbate or cyclodextrin are also known (WO 99/10008). Particulateadjuvant systems comprising fractions of QuilA, such as QS21 and QS7 aredescribed in WO 96/33739 and WO 96/11711. One such system is known as anIscorn and may contain one or more saponins.

Another preferred immunostimulant is an immunostimulatory oligonucleotide containing unmethylated CpG dinucleotides (“CpG”). CpG is an abbreviation for cytosine-guanosine dinucleotide motifs present in DNA. CpG is known in the art as being an adjuvant when administered by both systemic and mucosal routes (WO 96/02555, EP 468520, Davis et al., J. Immunol, 1998, 160(2):870-876; McCluskie and Davis, J. Immunol., 1998, 161(9):4463-6). Historically, it was observed that the DNA fraction of BCG could exert an anti-tumour effect. In further studies, synthetic oligonucleotides derived from BCG gene sequences were shown to be capable of inducing immunostimulatory effects (both in vitro and in vivo). The authors of these studies concluded that certain palindromic sequences, including a central CG motif, carried this activity. The central role of the CG motif in immunostimulation was later elucidated in a publication by Krieg, Nature 374, p546 1995. Detailed analysis has shown that the CG motif has to be in a certain sequence context, and that such sequences are common in bacterial DNA but are rare in vertebrate DNA. The immunostimulatory sequence is often: Purine, Purine, C, G, pyrimidine, pyrimidine; wherein the CG motif is not methylated, but other unmethylated CpG sequences are known to be immunostimulatory and may be used in the present invention.

In certain combinations of the six nucleotides a palindromic sequence is present. Several of these motifs, either as repeats of one motif or a combination of different motifs, can be present in the same oligonucleotide. Oligonucleotides containing one or more of these immunostimulatory sequences can activate various immune subsets, including natural killer cells (which produce interferon γ and have cytolytic activity) and macrophages (Wooldrige et al Vol 89 (no. 8), 1977). Other unmethylated CpG containing sequences not having this consensus sequence have also now been shown to be immunomodulatory.
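The hexamer consensus described above (purine, purine, C, G, pyrimidine, pyrimidine) and the palindrome property can be illustrated with a short script. This is a minimal sketch only: the function names and the example oligonucleotide are illustrative assumptions and are not sequences disclosed herein, and methylation status, which cannot be read from sequence text, is not modelled.

```python
import re

# Consensus: purine, purine, C, G, pyrimidine, pyrimidine.
# A lookahead is used so that overlapping hexamers are all reported.
CPG_HEXAMER = re.compile(r"(?=([AG][AG]CG[CT][CT]))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_cpg_hexamers(oligo):
    """Return (0-based position, hexamer) for every consensus match."""
    oligo = oligo.upper()
    return [(m.start(), m.group(1)) for m in CPG_HEXAMER.finditer(oligo)]


def is_palindromic(hexamer):
    """A DNA palindrome reads the same as its own reverse complement."""
    return hexamer.upper() == reverse_complement(hexamer)


# Hypothetical oligonucleotide, used only to exercise the functions.
example_oligo = "ATAACGTTCTGACGTTGA"
for pos, hexamer in find_cpg_hexamers(example_oligo):
    kind = "palindromic" if is_palindromic(hexamer) else "non-palindromic"
    print(pos, hexamer, kind)
```

In this example one hexamer (AACGTT) is its own reverse complement and the other (GACGTT) is not, which illustrates why only certain combinations of the six nucleotides give a palindrome.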

CpG, when formulated into vaccines, is generally administered in free solution together with free antigen (WO 96/02555; McCluskie and Davis, supra) or covalently conjugated to an antigen (WO 98/16247), or formulated with a carrier such as aluminium hydroxide ((Hepatitis surface antigen) Davis et al. supra; Brazolot-Millan et al., Proc. Natl. Acad. Sci., USA, 1998, 95(26), 15553-8).

Such immunostimulants as described above may be formulated together with carriers, such as for example liposomes, oil in water emulsions, and/or metallic salts, including aluminium salts (such as aluminium hydroxide). For example, 3D-MPL may be formulated with aluminium hydroxide (EP 0 689 454) or oil in water emulsions (WO 95/17210); QS21 may be advantageously formulated with cholesterol containing liposomes (WO 96/33739), oil in water emulsions (WO 95/17210) or alum (WO 98/15287); CpG may be formulated with alum (Davis et al. supra; Brazolot-Millan supra) or with other cationic carriers.

Combinations of immunostimulants are also preferred, in particular a combination of a monophosphoryl lipid A and a saponin derivative (WO 94/00153; WO 95/17210; WO 96/33739; WO 98/56414; WO 99/12565; WO 99/11241), more particularly the combination of QS21 and 3D-MPL as disclosed in WO 94/00153. Alternatively, a combination of CpG plus a saponin such as QS21 also forms a potent adjuvant for use in the present invention. Alternatively the saponin may be formulated in a liposome or in an Iscom and combined with an immunostimulatory oligonucleotide.

Thus, suitable adjuvant systems include, for example, a combination of monophosphoryl lipid A, preferably 3D-MPL, together with an aluminium salt.

An enhanced system involves the combination of a monophosphoryl lipid A and a saponin derivative, particularly the combination of QS21 and 3D-MPL as disclosed in WO 94/00153, or a less reactogenic composition where the QS21 is quenched in cholesterol containing liposomes (DQ) as disclosed in WO 96/33739. This combination may additionally comprise an immunostimulatory oligonucleotide.

A particularly potent adjuvant formulation involving QS21, 3D-MPL and tocopherol in an oil in water emulsion is described in WO 95/17210 and is another preferred formulation for use in the invention.

Another preferred formulation comprises a CpG oligonucleotide alone or together with an aluminium salt.

In a further aspect of the present invention there is provided a method of manufacture of a vaccine formulation as herein described, wherein the method comprises admixing a polypeptide according to the invention with a suitable adjuvant.

Particularly preferred adjuvant combinations for use in the formulations according to the invention are as follows:

i) 3D-MPL+QS21 in a liposome

ii) Alum+3D-MPL

iii) Alum+QS21 in a liposome+3D-MPL

iv) Alum+CpG

v) 3D-MPL+QS21+oil in water emulsion

vi) CpG

Administration of the pharmaceutical composition may take the form of one or of more than one individual dose, for example as repeat doses of the same polypeptide containing composition, or in a heterologous “prime-boost” vaccination regime. A heterologous prime-boost regime uses administration of different forms of vaccine in the prime and the boost, each of which may itself include two or more administrations. The priming composition and the boosting composition will have at least one antigen in common, although it is not necessarily an identical form of the antigen; it may be a different form of the same antigen.

Prime boost immunisations according to the invention may be performed with a combination of protein and DNA-based formulations. Such a strategy is considered to be effective in inducing broad immune responses. Adjuvanted protein vaccines induce mainly antibodies and T helper immune responses, while delivery of DNA as a plasmid or a live vector induces strong cytotoxic T lymphocyte (CTL) responses. Thus, the combination of protein and DNA vaccination will provide for a wide variety of immune responses. This is particularly relevant in the context of HIV, since both neutralising antibodies and CTL are thought to be important for the immune defense against HIV.

In accordance with the invention a schedule for vaccination may comprise the sequential (“prime-boost”) or simultaneous administration of polypeptide antigens and DNA encoding the polypeptides. The DNA may be delivered as naked DNA such as plasmid DNA or in the form of a recombinant live vector, e.g. a poxvirus vector, an adenovirus vector, a measles virus vector or any other suitable live vector. Protein antigens may be injected once or several times followed by one or more DNA administrations, or DNA may be used first for one or more administrations followed by one or more protein immunisations.

A particular example of prime-boost immunisation according to the invention involves priming with DNA in the form of a recombinant live vector such as a modified poxvirus vector, for example Modified Virus Ankara (MVA) or an alphavirus, for example Venezuelan Equine Encephalitis Virus, or an adenovirus vector, or a measles virus vector, followed by boosting with a protein, preferably an adjuvanted protein.

Thus the invention further provides a pharmaceutical kit comprising:

-   a) a composition comprising a polypeptide comprising Nef or an immunogenic fragment or derivative thereof and p17 and/or p24 Gag or an immunogenic fragment or derivative thereof, wherein when both p17 and p24 Gag are present there is at least one HIV antigen or immunogenic fragment or derivative between them, together with a pharmaceutically acceptable excipient; and
-   b) a composition comprising a polynucleotide encoding one or more of Nef and Gag or an immunogenic fragment or derivative of Nef or Gag containing a Nef or Gag epitope present in the polypeptide of a), together with a pharmaceutically acceptable excipient.

Preferably the polypeptide of a) further comprises RT or an immunogenic fragment or derivative thereof such as p51RT.

In an alternative embodiment the pharmaceutical kit comprises:

-   a) a composition comprising a polynucleotide encoding a polypeptide comprising Nef or an immunogenic fragment or derivative thereof and p17 and/or p24 Gag or an immunogenic fragment or derivative thereof, wherein when both p17 and p24 Gag are present there is at least one HIV antigen or immunogenic fragment or derivative between them, together with a pharmaceutically acceptable excipient; and
-   b) a composition comprising a polypeptide comprising one or more of Nef and Gag or an immunogenic fragment or derivative of Nef or Gag containing a Nef or Gag epitope present in the polypeptide of a), together with a pharmaceutically acceptable excipient.

Preferably the polynucleotide of a) encodes a polypeptide which further comprises RT or an immunogenic fragment or derivative thereof such as p51RT.

Preferred polypeptides and polynucleotides for use in a prime/boost kit according to the invention are the polypeptides and polynucleotides as described herein. Thus, the protein component of a protein/DNA type prime boost approach may be any of the preferred fusion proteins described herein. Likewise, the DNA component may be a polynucleotide encoding any of the preferred proteins.

Thus for example, the p24-RT-Nef-p17, p24-RT*-Nef-p17, p24-p51RT-Nef-p17, p24-p51RT*-Nef-p17, p17-p51RT-Nef or p17-p51RT*-Nef fusions or any of the p17-Nef fusions as described herein may be provided in a prime boost kit wherein the priming composition comprises the fusion protein and the boosting composition comprises a polynucleotide encoding the fusion protein, or the priming composition comprises the polynucleotide and the boosting composition comprises the fusion protein.

Both the priming composition and the boosting composition may be delivered in more than one dose. Furthermore the initial priming and boosting doses may be followed up with further doses which may be alternated to result in e.g. a DNA plasmid prime/protein boost/further DNA plasmid dose/further protein dose.

By codon optimisation it is meant that the polynucleotide sequence is optimised to resemble the codon usage of genes in the desired expression system, for example a prokaryotic system such as E. coli. In particular, the codon usage in the sequence is optimised to resemble that of highly expressed E. coli genes.

The purpose of codon optimizing for expression in a recombinant system according to the invention is twofold: to improve expression levels of the recombinant product and to render expression products more homogeneous (obtain a more homogeneous expression pattern). Improved homogeneity means that there are fewer irrelevant expression products such as truncates. Codon usage adaptation to E. coli expression can also eliminate the putative “frame-shift” sequences as well as premature termination and/or internal initiation sites.

The DNA code has 4 letters (A, T, C and G) and uses these to spell three letter “codons” which represent the amino acids of the proteins encoded in an organism's genes. The linear sequence of codons along the DNA molecule is translated into the linear sequence of amino acids in the protein(s) encoded by those genes. The code is highly degenerate, with 61 codons coding for the 20 natural amino acids and 3 codons representing “stop” signals. Thus, most amino acids are coded for by more than one codon; in fact several are coded for by four or more different codons.
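The degeneracy figures given above can be reproduced by enumerating the standard genetic code. The following is a minimal sketch; the variable names are illustrative, and the 64-character amino acid string is the standard code listed in T, C, A, G order with `*` marking a stop codon.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(CODON_TABLE), "codons in total")
print(len(sense_codons), "sense codons for", len(codons_per_amino_acid), "amino acids")
print("stop codons:", stop_codons)

# Amino acids encoded by four or more different codons.
highly_degenerate = sorted(aa for aa, n in codons_per_amino_acid.items() if n >= 4)
print("four or more codons:", highly_degenerate)
```

Only methionine and tryptophan are encoded by a single codon in this table, so every other amino acid offers at least one synonymous choice when a sequence is re-coded.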

Where more than one codon is available to code for a given amino acid, it has been observed that the codon usage patterns of organisms are highly non-random. Different species show a different bias in their codon selection and, furthermore, utilisation of codons may be markedly different in a single species between genes which are expressed at high and low levels. This bias is different in viruses, plants, bacteria and mammalian cells, and some species show a stronger bias away from a random codon selection than others. For example, humans and other mammals are less strongly biased than certain bacteria or viruses. For these reasons, there is a significant probability that a viral gene from a mammalian virus expressed in E. coli, or a foreign or recombinant gene expressed in mammalian cells will have an inappropriate distribution of codons for efficient expression. It is believed that the presence in a heterologous DNA sequence of clusters of codons or an abundance of codons which are rarely observed in the host in which expression is to occur, is predictive of low heterologous expression levels in that host.
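Codon bias of the kind described above is usually quantified by tallying, for each amino acid, how often each synonymous codon is used. The sketch below does this for a short fragment; the function name is an illustrative assumption, the fragment is the first 60 nucleotides of SEQ ID NO: 1, and a fragment this short is far too small to characterise real bias, so it serves only to show the calculation.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codon_fractions(cds):
    """Map each amino acid to {codon: fraction of that amino acid's codons}."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    counts = defaultdict(Counter)
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        counts[CODON_TABLE[codon]][codon] += 1
    return {
        aa: {codon: n / sum(tally.values()) for codon, n in tally.items()}
        for aa, tally in counts.items()
    }


# First 60 nucleotides of SEQ ID NO: 1 (native, non-optimised F4 gene).
fragment = "atggttatcgtgcagaacatccaggggcaaatggtacatcaggccatatcacctagaact"
usage = synonymous_codon_fractions(fragment)
for aa in sorted(usage):
    print(aa, {codon: round(frac, 2) for codon, frac in usage[aa].items()})
```

Applied to a full gene and compared with the same tally for highly expressed genes of the host, such a table indicates which codons are candidates for replacement.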

In the polynucleotides of the present invention, the codon usage pattern may thus be altered from that typical of human immunodeficiency viruses to more closely represent the codon bias of the target organism, e.g. E. coli.

There are a variety of publicly available programs useful for codon optimization, for example “CalcGene” (Hale and Thompson, Protein Expression and Purification 12: 185-189 (1998)).

EXAMPLES

Example 1

Construction and Expression of HIV-1 p24-RT-Nef-p17 Fusion F4 and F4 Codon Optimized (co)

1. F4 Non-Codon-Optimised

HIV-1 gag p24 (capsid protein) and p17 (matrix protein), the reverse transcriptase and Nef proteins were expressed in E. coli B834 strain (B834 (DE3) is a methionine auxotroph parent of BL21 (DE3)), under the control of the bacteriophage T7 promoter (pET expression system).

They were expressed as a single fusion protein containing the complete sequence of the four proteins. The mature p24 coding sequence comes from the HIV-1 BH10 molecular clone, the mature p17 sequence and RT gene from HXB2 and the Nef gene from the BRU isolate.

After induction, recombinant cells expressed significant levels of the p24-RT-Nef-p17 fusion that amounted to 10% of total protein.

When cells were grown and induced at 22° C., the p24-RT-Nef-p17 fusion protein was confined mainly to the soluble fraction of bacterial lysates (even after freezing/thawing). When grown at 30° C., around 30% of the recombinant protein was associated with the insoluble fraction.

The fusion protein p24-RT-Nef-p17 is made up of 1136 amino acids with a molecular mass of approximately 129 kDa. The full-length protein migrates to about 130 kDa on SDS gels. The protein has a theoretical isoelectric point (pI) of 7.96 based on its amino acid sequence, confirmed by 2D-gel electrophoresis.

Details of the Recombinant Plasmid:

-   name: pRIT15436 (or lab name pET28b/p24-RT-Nef-p17)
-   host vector: pET28b
-   replicon: colE1
-   selection: kanamycin
-   promoter: T7
-   insert: p24-RT-Nef-p17 fusion gene.

Details of the Recombinant Protein:

p24-RT-Nef-p17 fusion protein: 1136 amino acids.

N-term - p24: 232 a.a. - hinge: 2 a.a. - RT: 562 a.a. - hinge: 2 a.a. - Nef: 206 a.a. - P17: 132 a.a. - C-term
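The segment lengths listed above can be checked against the stated total of 1136 amino acids with simple arithmetic. The sketch below is illustrative only; the variable names are assumptions, and the mass figure is the "approximately 129 kDa" value quoted in the text, used here to derive a mean mass per residue.

```python
# Segment lengths as listed for the p24-RT-Nef-p17 fusion, N-term to C-term.
segments = [
    ("p24", 232),
    ("hinge", 2),
    ("RT", 562),
    ("hinge", 2),
    ("Nef", 206),
    ("p17", 132),
]

total_residues = sum(length for _, length in segments)
print("total residues:", total_residues)

# Approximate molecular mass quoted in the text (129 kDa), in Daltons.
approx_mass_da = 129_000
mean_residue_mass = approx_mass_da / total_residues
print("mean mass per residue (Da):", round(mean_residue_mass, 1))
```

The resulting mean residue mass falls in the range usually expected for proteins, which is consistent with the stated mass and length.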

Nucleotide and Amino-Acid Sequences:

Nucleotide Sequence

[SEQ ID NO: 1]atggttatcgtgcagaacatccaggggcaaatggtacatcaggccatatcacctagaactttaaatgcatgggtaaaagtagtagaagagaaggctttcagcccagaagtaatacccatgttttcagcattatcagaaggagccaccccacaagatttaaacaccatgctaaacacagtggggggacatcaagcagccatgcaaatgttaaaagagaccatcaatgaggaagctgcagaatgggatagagtacatccagtgcatgcagggcctattgcaccaggccagatgagagaaccaaggggaagtgacatagcaggaactactagtacccttcaggaacaaataggatggatgacaaataatccacctatcccagtaggagaaatttataaaagatggataatcctgggattaaataaaatagtaagaatgtatagccctaccagcattctggacataagacaaggaccaaaagaaccttttagagactatgtagaccggttctataaaactctaagagccgagcaagcttcacaggaggtaaaaaattggatgacagaaaccttgttggtccaaaatgcgaacccagattgtaagactattttaaaagcattgggaccagcggctacactagaagaaatgatgacagcatgtcagggagtaggaggacccggccataaggcaagagttttg

ggccccattagccctattgagactgtgtcagtaaaattaaagccaggaatggatggcccaaaagttaaacaatggccattgacagaagaaaaaataaaagcattagtagaaatttgtacagagatggaaaaggaagggaaaatttcaaaaattgggcctgaaaatccatacaatactccagtatttgccataaagaaaaaagacagtactaaatggagaaaattagtagatttcagagaacttaataagagaactcaagacttctgggaagttcaattaggaataccacatcccgcagggttaaaaaagaaaaaatcagtaacagtactggatgtgggtgatgcatatttttcagttcccttagatgaagacttcaggaaatatactgcatttaccatacctagtataaacaatgagacaccagggattagatatcagtacaatgtgcttccacagggatggaaaggatcaccagcaatattccaaagtagcatgacaaaaatcttagagccttttagaaaacaaaatccagacatagttatctatcaatacatggatgatttgtatgtaggatctgacttagaaatagggcagcatagaacaaaaatagaggagctgagacaacatctgttgaggtggggacttaccacaccagacaaaaaacatcagaaagaacctccattccttaaaatgggttatgaactccatcctgataaatggacagtacagcctatagtgctgccagaaaaagacagctggactgtcaatgacatacagaagttagtggggaaattgaattgggcaagtcagatttacccagggattaaagtaaggcaattatgtaaactccttagaggaaccaaagcactaacagaagtaataccactaacagaagaagcagagctagaactggcagaaaacagagagattctaaaagaaccagtacatggagtgtattatgacccatcaaaagacttaatagcagaaatacagaagcaggggcaaggccaatggacatatcaaatttatcaagagccatttaaaaatctgaaaacaggaaaatatgcaagaatgaggggtgcccacactaatgatgtaaaacaattaacagaggcagtgcaaaaaataaccacagaaagcatagtaatatggggaaagactcctaaatttaaactgcccatacaaaaggaaacatgggaaacatggtggacagagtattggcaagccacctggattcctgagtgggagtttgttaatacccctcctttagtgaaattatggtaccagttagagaaagaacccatagtaggagcagaaaccttctatgtagatggggcagctaacagggagactaaattaggaaaagcaggatatgttactaatagaggaagacaaaaagttgtcaccctaactgacacaacaaatcagaagactgagttacaagcaatttatctagctttgcaggattcgggattagaagtaaacatagtaacagactcacaatatgcattaggaatcattcaagcacaaccagatcaaagtgaatcagagttagtcaatcaaataatagagcagttaataaaaaaggaaaaggtctatctggcatgggtaccagcacacaaaggaattggaggaaatgaacaagtagataaattagtcagtgctggaatcaggaaagtgcta

ggtggcaagtggtcaaaaagtagtgtggttggatggcctactgtaagggaaagaatgagacgagctgagccagcagcagatggggtgggagcagcatctcgagacctggaaaaacatggagcaatcacaagtagcaatacagcagctaccaatgctgcttgtgcctggctagaagcacaagaggaggaggaggtgggttttccagtcacacctcaggtacctttaagaccaatgacttacaaggcagctgtagatcttagccactttttaaaagaaaaggggggactggaagggctaattcactcccaacgaagacaagatatccttgatctgtggatctaccacacacaaggctacttccctgattggcagaactacacaccagggccaggggtcagatatccactgacctttggatggtgctacaagctagtaccagttgagccagataaggtagaagaggccaataaaggagagaacaccagcttgttacaccctgtgagcctgcatggaatggatgaccctgagagagaagtgttagagtggaggtttgacagccgcctagcatttcatcacgtggcccgagagctgcatccggagtacttcaagaactgc

atgggtgcgagagcgtcagtattaagcgggggagaattagatcgatgggaaaaaattcggttaaggccagggggaaagaaaaaatataaattaaaacatatagtatgggcaagcagggagctagaacgattcgcagttaatcctggcctgttagaaacatcagaaggctgtagacaaatactgggacagctacaaccatcccttcagacaggatcagaagaacttagatcattatataatacagtagcaaccctctattgtgtgcatcaaaggatagagataaaagacaccaaggaagctttagacaagatagaggaagagcaaaacaaaagtaagaaaaaagcacagcaagcagcagctgacacaggacacagcaatcaggtcagccaaaattactaa

p24 sequence is in bold

Nef sequence is underlined

Boxes: nucleotides introduced by genetic construction

Amino-Acid Sequence

[SEQ ID NO: 2] MVIVQNIQGQMVHQAISPRTLNAWVKVVEEKAFSPEVIPMFSALSEGATP 50QDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREP 100RGSDIAGTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTS 150ILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCK 200TILKALGPAATLEEMMTACQGVGGPGHKARVL

GPISPIETVSVKLKPG 250 MDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGPENPYNTPVFAIKK300 KDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAY 350FSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMT 400KILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRWGLT 450 TPDKKHQKEPPFL

MGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLN 500WASQIYPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAENREILKEPVH 550GVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDV 600KQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWE 650FVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVTNRGRQK 700VVTLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQYALGIIQAQPDQSES 750ELVNQIIEQLIKKEKVYLAWVPAHKGIGGNEQVDKLVSAGIRKV

MGGK 800 WSKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAATNAA 850CAWLEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIHSQ 900RRQDILDLWIYHTQGYFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVE 950EANKGENTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFK 1000 NC

MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAV 1050NPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLYCVHQRIEIKD 1100TKEALDKIEEEQNKSKKKAQQAAADTGHSNQVSQNY 1136

P24 sequence: amino-acids 1-232 (in bold)

RT sequence: amino-acids 235-795

Nef sequence: amino-acids 798-1002

P17 sequence: amino-acids 1005-1136

Boxes: amino-acids introduced by genetic construction

K (Lysine): instead of Tryptophan (W). Mutation introduced to remove enzyme activity.

Expression of the Recombinant Protein:

In the pET plasmid, the target gene (p24-RT-Nef-p17) is under control of the strong bacteriophage T7 promoter. This promoter is not recognized by E. coli RNA polymerase and is dependent on a source of T7 RNA polymerase in the host cell. The B834 (DE3) host cell contains a chromosomal copy of the T7 RNA polymerase gene under lacUV5 control and expression is induced by the addition of IPTG to the bacterial culture.

Pre-cultures were grown, in shake flasks, at 37° C. to mid-log phase (A620:0.6) and then stored at 4° C. overnight (to avoid stationary phase cultures). Cultures were grown in LBT medium supplemented with 1% glucose and 50 μg/ml kanamycin. Addition of glucose to the growth medium has the advantage of reducing basal recombinant protein expression (avoiding cAMP-mediated derepression of the lacUV5 promoter).

Ten ml of cultures stored overnight at 4° C. were used to inoculate 200 ml of LBT medium (without glucose) containing kanamycin. Cultures were grown at 30° C. and 22° C. and when O.D.620 reached 0.6, IPTG was added (1 mM final). Cultures were incubated for a further 3, 5 and 18 hours (overnight). Samples were collected before induction and after 3, 5 and 18 hours of induction.
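The inoculation and induction volumes above follow from simple dilution arithmetic (C1 x V1 = C2 x V2). The sketch below is illustrative only and rests on stated assumptions: the pre-culture is taken to still be at O.D. 0.6 when used, optical density is assumed to scale linearly on dilution, and the 1 M IPTG stock concentration is hypothetical since the text gives only the final concentration.

```python
def diluted_od(preculture_od, inoculum_ml, medium_ml):
    """Starting O.D. after adding an inoculum to fresh medium."""
    return preculture_od * inoculum_ml / (inoculum_ml + medium_ml)


def stock_volume_ml(final_conc_mm, culture_ml, stock_conc_mm):
    """Stock volume giving the final concentration (C1*V1 = C2*V2).

    The small volume increase caused by adding the stock is neglected.
    """
    return final_conc_mm * culture_ml / stock_conc_mm


# 10 ml of pre-culture (assumed O.D. 0.6) into 200 ml of LBT medium.
start_od = diluted_od(0.6, 10, 200)
print("starting O.D.:", round(start_od, 3))

# 1 mM final IPTG in the 210 ml culture, from a hypothetical 1 M (1000 mM) stock.
iptg_ml = stock_volume_ml(1.0, 210, 1000.0)
print("IPTG stock to add (ml):", round(iptg_ml, 3))
```

With a different stock concentration only the last argument changes; the final concentration of 1 mM is the value taken from the text.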

Extract Preparation was as Follows:

Cell pellets were suspended in breaking buffer* (at a theoretical O.D. of 10) and disrupted by four passages in a French press (at 20,000 psi or 1250 bars). Crude extracts (T) were centrifuged at 20,000 g for 30 min to separate the soluble (S) and insoluble (P) fractions.

*Breaking buffer: 50 mM Tris-HCl pH 8.0, 1 mM EDTA, 1 mM DTT + protease inhibitors cocktail (Complete/Boehringer).
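Resuspending a pellet "at a theoretical O.D. of 10" is commonly understood as choosing the buffer volume so that the cells, if they scattered light linearly, would read O.D. 10. The sketch below shows that calculation under this interpretation; the function name and the example culture values are hypothetical and are not taken from the text.

```python
def resuspension_volume_ml(culture_od, culture_ml, target_od=10.0):
    """Buffer volume that brings a harvested pellet to the target theoretical O.D.

    Assumes the amount of cells is proportional to O.D. x volume.
    """
    if target_od <= 0:
        raise ValueError("target O.D. must be positive")
    return culture_od * culture_ml / target_od


# Hypothetical harvest: 50 ml of culture at O.D. 2.0.
volume = resuspension_volume_ml(2.0, 50.0)
print("breaking buffer to add (ml):", volume)
```

Normalising every sample in this way means that equal gel loading volumes correspond to equal amounts of cells, which makes band intensities comparable between time points.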

SDS-PAGE and Western Blot Analysis:

Fractions corresponding to insoluble pellet (P), supernatant (S) and crude extract (T) were run on 10% reducing SDS-PAGE. p24-RT-Nef-p17 recombinant was detected by Coomassie blue staining and on Western blot (WB).

Coomassie staining: p24-RT-Nef-p17 protein appears as:

-   one band at ±130 kDa (fitting with calculated MW)
-   MW theoretical: 128,970 Daltons
-   MW apparent: 130 kDa

Western Blot Analysis:

-   Reagents: Monoclonal antibody to RT (p66/p51), purchased from ABI (Advanced Biotechnologies), dilution: 1/5000
-   Alkaline phosphatase-conjugate anti-mouse antibody, dilution: 1/7500
-   Expression level: Very strong p24-RT-Nef-p17 specific band after 20 h induction at 22° C., representing up to 10% of total protein (See FIG. 1A).

Recombinant Protein “Solubility”:

“Fresh” cellular extracts (T, S, P fractions): With growth/induction at 22° C./20 h, almost all p24-RT-Nef-p17 fusion protein is recovered in the soluble fraction of cellular extract (FIG. 1A). With growth/induction at 30° C./20 h, around 30% of p24-RT-Nef-p17 protein is associated with the insoluble fraction (FIG. 1A).

“Freezing/Thawing” (S2, P2 Fractions):

The soluble (S1) fraction (20 h induction at 22° C.) was conserved at −20° C., then thawed and centrifuged at 20,000 g/30 min to give S2 and P2 (resuspended in 1/10 vol.).

Breaking buffer with DTT: almost all p24-RT-Nef-p17 fusion protein still soluble (only 1-5% precipitated) (see FIG. 1B)

Breaking buffer without DTT: 85-90% of p24-RT-Nef-p17 still soluble (FIG. 1B)

Figures:

FIG. 1A: Coomassie staining and western blot.

FIG. 1B: p24-RT-Nef-p17 solubility assay

The F4 protein was purified using purification method I in Example 7.

The cell growth and induction conditions and cellular extracts preparation for the examples which follow are as described in Example 1 unless other conditions are specified (e.g. temperature, composition of breaking buffer).

2. F4 Codon-Optimised

The following polynucleotide sequence is codon optimized such that the codon usage resembles the codon usage in a highly expressed gene in E. coli. The amino acid sequence is identical to that given above for F4 non-codon optimized.
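That a codon-optimised sequence encodes the same protein can be checked by translating both versions and comparing the results codon by codon. The sketch below does this for the first 30 nucleotides of SEQ ID NO: 1 and SEQ ID NO: 3 only; the function names are illustrative assumptions, and the check covers just these ten codons, not the full-length sequences.

```python
from itertools import product

# Standard genetic code in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(cds):
    """Split a coding sequence into upper-case codons."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in codons(cds))


native = "atggttatcgtgcagaacatccaggggcaa"     # first 30 nt of SEQ ID NO: 1
optimised = "atggtcattgttcagaacatacagggccaa"  # first 30 nt of SEQ ID NO: 3

# Codon positions (0-based) where the two sequences use different codons.
changed = [
    (i, a, b)
    for i, (a, b) in enumerate(zip(codons(native), codons(optimised)))
    if a != b
]

print(translate(native), translate(optimised))
for i, a, b in changed:
    print(f"codon {i + 1}: {a} -> {b} ({CODON_TABLE[a]})")
```

Every change in this fragment is synonymous, so the encoded peptide is unchanged while the nucleotide sequence differs.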

Nucleotide Sequence for F4co:

[SEQ ID NO: 3]atggtcattgttcagaacatacagggccaaatggtccaccaggcaattagtccgcgaactcttaatgcatgggtgaaggtcgtggaggaaaaggcattctccccggaggtcattccgatgttttctgcgctatctgagggcgcaacgccgcaagaccttaataccatgcttaacacggtaggcgggcaccaagccgctatgcaaatgctaaaagagactataaacgaagaggccgccgaatgggatcgagtgcacccggtgcacgccggcccaattgcaccaggccagatgcgcgagccgcgcgggtctgatattgcaggaactacgtctacccttcaggagcagattgggtggatgactaacaatccaccaatcccggtcggagagatctataagaggtggatcatactgggactaaacaagatagtccgcatgtattctccgacttctatactggatatacgccaaggcccaaaggagccgttcagggactatgtcgaccgattctataagacccttcgcgcagagcaggcatcccaggaggtcaaaaattggatgacagaaactcttttggtgcagaatgcgaatccggattgtaaaacaattttaaaggctctaggaccggccgcaacgctagaagagatgatgacggcttgtcagggagtcggtggaccggggcataaagcccgcgtctta

ggcccgatatctccgatagaaacagtttcggtcaagcttaaaccagggatggatggtccaaaggtcaagcagtggccgctaacggaagagaagattaaggcgctcgtagagatttgtactgaaatggagaaggaaggcaagataagcaagatcgggccagagaacccgtacaatacaccggtatttgcaataaagaaaaaggattcaacaaaatggcgaaagcttgtagattttagggaactaaacaagcgaacccaagacttttgggaagtccaactagggatcccacatccagccggtctaaagaagaagaaatcggtcacagtcctggatgtaggagacgcatattttagtgtaccgcttgatgaggacttccgaaagtatactgcgtttactataccgagcataaacaatgaaacgccaggcattcgctatcagtacaacgtgctcccgcagggctggaaggggtctccggcgatatttcagagctgtatgacaaaaatacttgaaccattccgaaagcagaatccggatattgtaatttaccaatacatggacgatctctatgtgggctcggatctagaaattgggcagcatcgcactaagattgaggaactgaggcaacatctgcttcgatggggcctcactactcccgacaagaagcaccagaaggagccgccgttcctaaagatgggctacgagcttcatccggacaagtggacagtacagccgatagtgctgcccgaaaaggattcttggaccgtaaatgatattcagaaactagtcggcaagcttaactgggcctctcagatttacccaggcattaaggtccgacagctttgcaagctactgaggggaactaaggctctaacagaggtcatcccattaacggaggaagcagagcttgagctggcagagaatcgcgaaattcttaaggagccggtgcacggggtatactacgacccctccaaggaccttatagccgagatccagaagcaggggcagggccaatggacgtaccagatatatcaagaaccgtttaagaatctgaagactgggaagtacgcgcgcatgcgaggggctcatactaatgatgtaaagcaacttacggaagcagtacaaaagattactactgagtctattgtgatatggggcaagaccccaaagttcaagctgcccatacagaaggaaacatgggaaacatggtggactgaatattggcaagctacctggattccagaatgggaatttgtcaacacgccgccacttgttaagctttggtaccagcttgaaaaggagccgatagtaggggcagagaccttctatgtcgatggcgccgcgaatcgcgaaacgaagctaggcaaggcgggatacgtgactaataggggccgccaaaaggtcgtaacccttacggataccaccaatcagaagactgaactacaagcgatttaccttgcacttcaggatagtggcctagaggtcaacatagtcacggactctcaatatgcgcttggcattattcaagcgcagccagatcaaagcgaaagcgagcttgtaaaccaaataatagaacagcttataaagaaagagaaggtatatctggcctgggtccccgctcacaagggaattggcggcaatgagcaagtggacaagctagtcagcgctgggattcgcaaggttctt

gggggtaagtggtctaagtctagcgtagtcggctggccgacagtccgcgagcgcatgcgacgcgccgaaccagccgcagatggcgtgggggcagcgtctagggatctggagaagcacggggctataacttccagtaacacggcggcgacgaacgccgcatgcgcatggttagaagcccaagaagaggaagaagtagggtttccggtaactccccaggtgccgttaaggccgatgacctataaggcagcggtggatctttctcacttccttaaggagaaaggggggctggagggcttaattcacagccagaggcgacaggatattcttgatctgtggatttaccatacccaggggtactttccggactggcagaattacaccccggggccaggcgtgcgctatcccctgactttcgggtggtgctacaaactagtcccagtggaacccgacaaggtcgaagaggctaataagggcgagaacacttctcttcttcacccggtaagcctgcacgggatggatgacccagaacgagaggttctagaatggaggttcgactctcgacttgcgttccatcacgtagcacgcgagctgcatccagaatatttcaagaactgc

atgggcgccagggccagtgtacttagtggcggagaactagatcgatgggaaaagatacgcctacgcccggggggcaagaagaagtacaagcttaagcacattgtgtgggcctctcgcgaacttgagcgattcgcagtgaatccaggcctgcttgagacgagtgaaggctgtaggcaaattctggggcagctacagccgagcctacagactggcagcgaggagcttcgtagtctttataataccgtcgcgactctctactgcgttcatcaacgaattgaaataaaggatactaaagaggcccttgataaaattgaggaggaacagaataagtcgaaaaagaaggcccagcaggccgccgccgacaccgggcacagcaaccaggtgtcccaaaactactaa

p24 sequence is in bold

Nef sequence is underlined

Boxes: nucleotides introduced by genetic construction

The procedures used in relation to F4 non-codon optimized were appliedfor the codon-optimised sequence.

Details of the Recombinant Plasmid:

-   name: pRIT15513 (lab name: pET28b/p24-RT-Nef-p17)
-   host vector: pET28b
-   replicon: colE1
-   selection: kanamycin
-   promoter: T7
-   insert: p24-RT-Nef-p17 fusion gene, codon-optimized

The F4 codon-optimised gene was expressed in E. coli BLR(DE3) cells, a recA⁻ derivative of the B834(DE3) strain. The RecA mutation prevents the putative production of lambda phages.

Pre-cultures were grown, in shake flasks, at 37° C. to mid-log phase (A₆₂₀:0.6) and then stored at 4° C. overnight (to avoid stationary phase cultures).

Cultures were grown in LBT medium supplemented with 1% glucose and 50 μg/ml kanamycin. Addition of glucose to the growth medium has the advantage of reducing basal recombinant protein expression (avoiding cAMP-mediated derepression of the lacUV5 promoter).

Ten ml of cultures stored overnight at 4° C. were used to inoculate 200 ml of LBT medium (without glucose) containing kanamycin. Cultures were grown at 37° C. and when O.D.₆₂₀ reached 0.6, IPTG was added (1 mM final). Cultures were incubated for a further 19 hours (overnight), at 22° C. Samples were collected before and after 19 hours of induction.

Extract Preparation was as Follows:

Cell pellets were resuspended in sample buffer (at a theoretical O.D. of 10), boiled and directly loaded on SDS-PAGE.

SDS-PAGE and Western Blot Analysis:

Crude extracts samples were run on 10% reducing SDS-PAGE.

p24-RT-Nef-p17 recombinant protein is detected by Coomassie blue staining (FIG. 2) and on Western blot.

-   Coomassie staining: p24-RT-Nef-p17 protein appears as:
    -   one band at ±130 kDa (fitting with calculated MW)
    -   MW theoretical: 128,967 Daltons
    -   MW apparent: 130 kDa
-   Western blot analysis:
    -   Reagents: Rabbit polyclonal anti RT (rabbit PO3L16), dilution: 1/10,000
    -   Rabbit polyclonal anti Nef-Tat (rabbit 388), dilution: 1/10,000
    -   Alkaline phosphatase-conjugate anti-rabbit antibody, dilution: 1/7500

After induction at 22° C. over 19 hours, recombinant BLR(DE3) cells expressed the F4 fusion at a very high level ranging from 10-15% of total protein.

In comparison with F4 from the native gene, the F4 recombinant product profile from the codon-optimised gene is slightly simplified. The major F4-related band at 60 kDa, as well as minor bands below, disappeared (see FIG. 2). Compared to the B834(DE3) recombinant strain expressing F4, the BLR(DE3) strain producing F4co has the following advantages: higher production of F4 full-length protein, less complex band pattern of recombinant product.

Example 2

Construction and Expression of P51 RT (Truncated, Codon-Optimised RT)

The RT/p66 region between amino acids 428-448 is susceptible to E. coli proteases. The P51 construct terminates at Leu 427 resulting in the elimination of the RNaseH domain (see RT sequence alignment in FIG. 3).

The putative E. coli “frameshift” sequences identified in the RT native gene sequence were also eliminated (by codon-optimization of the p51 gene).

p51 Synthetic Gene Design/Construction:

The sequence of the synthetic p51 gene was designed according to E. coli codon usage. Thus it was codon optimized such that the codon usage resembles the codon usage in a highly expressed gene in E. coli. The synthetic gene was constructed as follows: 32 oligonucleotides were assembled in a single-step PCR. In a second PCR the full-length assembly was amplified using the end primers and the resulting PCR product was cloned into pGEM-T intermediate plasmid. After correction of point errors introduced during gene synthesis, the p51 synthetic gene was cloned into pET29a expression plasmid. This recombinant plasmid was used to transform B834 (DE3) cells.
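Assembly of a synthetic gene from oligonucleotides by PCR generally relies on overlapping oligonucleotides that alternate between the two strands. The sketch below shows one way such a set could be laid out and checked. It is illustrative only: the oligonucleotide length, the overlap, the function names and the test sequence are all assumptions, since the text states only that 32 oligonucleotides were used.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_assembly_oligos(gene, oligo_len=60, overlap=20):
    """Tile a gene with overlapping oligos that alternate between strands.

    Returns (start, strand, oligo) tuples; '-' oligos are written 5'->3'
    on the bottom strand, i.e. reverse-complemented.
    """
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than the oligo")
    gene = gene.upper()
    oligos, start, index = [], 0, 0
    while True:
        piece = gene[start:start + oligo_len]
        strand = "+" if index % 2 == 0 else "-"
        oligos.append((start, strand, piece if strand == "+" else reverse_complement(piece)))
        if start + oligo_len >= len(gene):
            break
        start += oligo_len - overlap
        index += 1
    return oligos


def reassemble(oligos):
    """Rebuild the top strand from the design to confirm full coverage."""
    gene = ""
    for start, strand, oligo in oligos:
        top = oligo if strand == "+" else reverse_complement(oligo)
        gene = gene[:start] + top
    return gene


# Hypothetical 150-nt sequence generated only to exercise the functions.
example_gene = "".join("ACGT"[(i * 7 + i // 5) % 4] for i in range(150))
design = design_assembly_oligos(example_gene)
for start, strand, oligo in design:
    print(start, strand, len(oligo))
print("design covers gene:", reassemble(design) == example_gene)
```

In practice overlaps are also chosen for similar melting temperatures, which this sketch does not attempt.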

Recombinant Protein Characteristics: P51 RT Nucleotide Sequence

[SEQ ID NO: 4] atg

ggtccgatctctccgatagaaacagtttcggtcaagcttaaaccagggatg 60gatggtccaaaggtcaagcagtggccgctaacggaagagaagattaaggcgctcgtagag 120atttgtactgaaatggagaaggaaggcaagataagcaagatcgggccagagaacccgtac 180aatacaccggtatttgcaataaagaagaaggattcaacaaaatggcgaaagcttgtagat 240tttagggaactaaacaagcgaacccaagacttttgggaagtccaactaggtatcccacat 300ccagccggtctaaagaagaagaaatcggtcacagtcctggatgtaggagacgcatatttt 360agtgtaccgcttgatgaggacttccgaaagtatactgcgtttactataccgagcataaac 420aatgaaacgccaggcattcgctatcagtacaacgtgctcccgcagggctggaaggggtct 480ccggcgatatttcagagctctatgacaaaaatacttgaaccattccgaaagcagaatccg 540gatattgtaatttaccaatacatggacgatctctatgtgggctcggatctagaaattggg 600cagcatcgcactaagattgaggaactgaggcaacatctgcttcgatggggcctcactact 660cccgacaagaagcaccagaaggagccgccgttcctaaagatgggctacgagcttcatccg 720gacaagtggacagtacagccgatagtgctgcccgaaaaggattcttggaccgtaaatgat 780attcagaaactagtcggcaagcttaactgggcctctcagatttacccaggcattaaggtc 840cgacagctttgcaagctactgaggggaactaaggctctaacagaggtcatcccattaacg 900gaggaagcagagcttgagctggcagagaatcgcgaaattcttaaggagccggtgcacggg 960gtatactacgacccctccaaggaccttatagccgagatccagaagcaggggcagggccaa 1020tggacgtaccagatatatcaagaaccgtttaagaatctgaagactgggaagtacgcgcgc 1080atgcgaggggctcatactaatgatgtaaagcaacttacggaagcagtacaaaagattact 1140actgagtctattgtgatatggggcaagaccccaaagttcaagctgcccatacagaaggaa 1200acatgggaaacatggtggactgaatattggcaagctacctggattccagaatgggaattt 1260gtcaacacgccgccgctggtaaaactg

taa 1302

Boxes: amino acids introduced by genetic construction

Amino-Acid Sequence:

[SEQ ID NO: 5] M

GPISPIETVSVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGPENPY 60NTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDA 120 YFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNP 180DIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRWGLTTPDKKHQKEPPFLKMGYELHP 240DKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEVIPLT 300EEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYAR 360MRGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEW 420 EFVNTPPLVKL

433

Boxes: amino acids introduced by genetic construction. K (Lysine): instead of Tryptophan (W). Mutation introduced to remove enzyme activity.

Length, Molecular Weight, Isoelectric Point (IP):

-   -   433 AA, MW: 50.3 kDa, IP: 9.08

p51 Expression in B834(DE3) Cells:

P51 expression level and recombinant protein solubility were evaluated, in parallel to the RT/p66 production strain.

p51 Expression Level:

Induction condition: cells grown/induced at 37° C. (+1 mM IPTG), during 5 hours.

Breaking buffer: 50 mM Tris/HCl, pH: 7.5, 1 mM EDTA, +/−1 mM DTT.

Western Blot Analysis:

Reagents: rabbit polyclonal anti RT (rabbit PO3L16) (dilution: 1/10,000)

-   -   Alkaline phosphatase-conjugate anti-rabbit antibody (dilution:        1/7500)

Cellular fractions corresponding to crude extracts (T), insoluble pellet (P) and supernatant (S) were run on 10% reducing SDS-PAGE.

As illustrated on the Coomassie stained gel and Western blot (FIG. 4), very high expression of P51 (15-20% of total protein) was observed, higher than that observed for P66.

For both p51 and p66 proteins (after 5 h induction at 37° C.), 80% of the recombinant products were recovered in the soluble fraction (S1) of cellular extracts (see FIG. 4). When expressed at 30° C., 99% of recombinant proteins were associated with the soluble fraction (data not shown).

The p51 Western blot pattern was multiband, but less complex than that observed for P66.

Solubility Assay

Solubility assay: Freezing/thawing of the soluble (S1) fraction (5 h induction, 37° C.) prepared under reducing (breaking buffer with DTT) and non-reducing conditions. After thawing, S1 samples were centrifuged at 20,000 g for 30 minutes, generating S2 and P2 (P2 is resuspended in 1/10 vol.).

After freezing/thawing of soluble fractions (S1), prepared under reducing as well as non-reducing conditions, 99% of p51 and p66 are still recovered in the soluble (S2) fraction. Only 1% is found in the precipitate (P2). This is shown in FIG. 5.

Example 3 Construction and Expression of p17-Nef and Nef-p17 with or without Linker

The double fusion proteins were constructed with and without linkers. The linkers aimed to decrease potential interactions between the two fusion partners and are as follows:

Nef-GSGGGP-P17 and p17-GSGGGP-Nef (linker GSGGGP: SEQ ID NO: 20)

Recombinant Plasmids Construction:

pET29a/Nef-p17 Expression Vector:

-   -   Nef-p17 fusion gene was amplified by PCR from the F4 recombinant        plasmid. The PCR product was cloned into the intermediate pGEM-T        cloning vector and subsequently into the pET29a expression        vector.

pET28b/p17-Nef Expression Vector:

-   -   Nef gene was amplified by PCR from the F4 recombinant plasmid.        The PCR product was cloned into the intermediate pGEM-T cloning        vector and subsequently into the pET28b/p17 expression vector,        as a C-terminal in frame fusion with the p17 gene.

pET29a/Nef-linker-p17 and pET28b/p17-linker-Nef Expression Vector:

-   -   An 18 bp DNA fragment coding for the hexapeptide linker (GSGGGP; SEQ ID NO: 20) was inserted between the Nef and p17 fusion partners by site-directed mutagenesis (using the “GeneTailor Site-Directed Mutagenesis System”, Invitrogen).
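As an illustrative check only, the 18 bp insertion can be reproduced in silico at the Nef/p17 junction. The sketch below takes the junction of the Nef-p17 gene (SEQ ID NO: 6) and the linker as it appears in the Nef-linker-p17 listing (SEQ ID NO: 10). The function names and the placement of the insertion point immediately after `aggcct` are assumptions made for the example, and the codon dictionary covers only the five codons present in the linker. This is not a model of the mutagenesis protocol itself.

```python
# 18 bp linker as it appears in the Nef-linker-p17 listing (SEQ ID NO: 10).
LINKER = "ggatccggtggcggccct"

# Subset of the standard genetic code: only the codons present in the linker.
CODONS = {"gga": "G", "tcc": "S", "ggt": "G", "ggc": "G", "cct": "P"}


def translate(dna):
    """Translate a sequence made up only of the codons listed in CODONS."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def insert_after(template, anchor, insert):
    """Insert `insert` immediately after the first occurrence of `anchor`."""
    pos = template.index(anchor) + len(anchor)
    return template[:pos] + insert + template[pos:]


# Junction of the Nef-p17 fusion gene: ...Nef, then agg cct, then p17...
np_junction = "ttcaagaactgcaggcctatgggtgcgaga"
nlp_junction = insert_after(np_junction, "aggcct", LINKER)

print(translate(LINKER))
print(nlp_junction)
print(len(nlp_junction) - len(np_junction), "bp added")
```

The 18 added base pairs correspond to six codons, so the reading frame of the downstream fusion partner is unchanged.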

Recombinant Protein Characteristics:

Length, Molecular Weight, Isoelectric Point (IP)

Nef-p17 (named NP): 340 AA, MW: 38.5 kDa, IP: 7.48

Nef-GSGGGP-P17 (named NLP; Nef-SEQ ID NO: 20-P17): 346 AA, MW: 38.9 kDa, IP: 7.48

p17-Nef (named PN; P17-SEQ ID NO: 20): 342 AA, MW: 38.7 kDa, IP: 7.19

p17-GSGGGP-Nef (named PLN; p17-SEQ ID NO: 20-Nef): 348 AA, MW: 39.1 kDa, IP: 7.19
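The stated protein lengths can be cross-checked against the end positions printed in the nucleotide listings for these four constructs (1023, 1041, 1029 and 1047), since each coding sequence comprises three nucleotides per residue plus one stop codon. A minimal sketch of that arithmetic follows; the function name `coding_length_nt` is hypothetical.

```python
def coding_length_nt(n_residues, stop_codon=True):
    """Nucleotides needed to encode n_residues, optionally with one stop codon."""
    return 3 * (n_residues + (1 if stop_codon else 0))


# (amino-acid length, last nucleotide position printed in the listing)
constructs = {
    "NP": (340, 1023),
    "NLP": (346, 1041),
    "PN": (342, 1029),
    "PLN": (348, 1047),
}

for name, (n_aa, n_nt) in constructs.items():
    expected = coding_length_nt(n_aa)
    status = "consistent" if expected == n_nt else "MISMATCH"
    print(f"{name}: {n_aa} AA -> {expected} nt (listing: {n_nt}) {status}")
```

The same relation shows that each linker variant is 18 nucleotides, i.e. six residues, longer than its linker-free counterpart.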

Amino-Acid Sequences and Polynucleotide Sequences:

Nef-p17 nucleotide sequence

[SEQ ID NO: 6]atgggtggcaagtggtcaaaaagtagtgtggttggatggcctactgtaagggaaagaatg 60agacgagctgagccagcagcagatggggtgggagcagcatctcgagacctggaaaaacat 120ggagcaatcacaagtagcaatacagcagctaccaatgctgcttgtgcctggctagaagca 180caagaggaggaggaggtgggttttccagtcacacctcaggtacctttaagaccaatgact 240tacaaggcagctgtagatcttagccactttttaaaagaaaaggggggactggaagggcta 300attcactcccaacgaagacaagatatccttgatctgtggatctaccacacacaaggctac 360ttccctgattggcagaactacacaccagggccaggggtcagatatccactgacctttgga 420tggtgctacaagctagtaccagttgagccagataaggtagaagaggccaataaaggagag 480aacaccagcttgttacaccctgtgagcctgcatggaatggatgaccctgagagagaagtg 540ttagagtggaggtttgacagccgcctagcatttcatcacgtggcccgagagctgcatccg 600gagtacttcaagaactgcaggcctatgggtgcgagagcgtcagtattaagcgggggagaa 660ttagatcgatgggaaaaaattcggttaaggccagggggaaagaaaaaatataaattaaaa 720catatagtatgggcaagcagggagctagaacgattcgcagttaatcctggcctgttagaa 780acatcagaaggctgtagacaaatactgggacagctacaaccatcccttcagacaggatca 840gaagaacttagatcattatataatacagtagcaaccctctattgtgtgcatcaaaggata 900gagataaaagacaccaaggaagctttagacaagatagaggaagagcaaaacaaaagtaag 960aaaaaagcacagcaagcagcagctgacacaggacacagcaatcaggtcagccaaaattac 1020 gaa1023

Nef-p17 (NP)

[SEQ ID NO: 7]MGGKWSKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAATNAACAWLEA 60QEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHTQGY 120FPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANKGENTSLLHPVSLHGMDDPEREV 180LEWRFDSRLAFHHVARELHPEYFKNC

MGARASVLSGGELDRWEKIRLRPGGKKKYKLK 240HIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLYCVHQRI 300EIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSNQVSQNY 340

Box: amino acids introduced by genetic construction. Nef sequence is in bold.

P17-Nef Nucleotide Sequence:

[SEQ ID NO: 8]atgggtgcgagagcgtcagtattaagcgggggagaattagatcgatgggaaaaaattcgg 60ttaaggccagggggaaagaaaaaatataaattaaaacatatagtatgggcaagcagggag 120ctagaacgattcgcagttaatcctggcctgttagaaacatcagaaggctgtagacaaata 180ctgggacagctacaaccatcccttcagacaggatcagaagaacttagatcattatataat 240acagtagcaaccctctattgtgtgcatcaaaggatagagataaaagacaccaaggaagct 300ttagacaagatagaggaagagcaaaacaaaagtaagaaaaaagcacagcaagcagcagct 360gacacaggacacagcaatcaggtcagccaaaattacctcgacaggcctatgggtggcaag 420tggtcaaaaagtagtgtggttggatggcctactgtaagggaaagaatgagacgagctgag 480ccagcagcagatggggtgggagcagcatctcgagacctggaaaaacatggagcaatcaca 540agtagcaatacagcagctaccaatgctgcttgtgcctggctagaagcacaagaggaggag 600gaggtgggttttccagtcacacctcaggtacctttaagaccaatgacttacaaggcagct 660gtagatcttagccactttttaaaagaaaaggggggactggaagggctaattcactcccaa 720cgaagacaagatatccttgatctgtggatctaccacacacaaggctacttccctgattgg 780cagaactacacaccagggccaggggtcagatatccactgacctttggatggtgctacaag 840ctagtaccagttgagccagataaggtagaagaggccaataaaggagagaacaccagcttg 900ttacaccctgtgagcctgcatggaatggatgaccctgagagagaagtgttagagtggagg 960tttgacagccgcctagcatttcatcacgtggcccgagagctgcatccggagtacttcaag 1020aactgctaa 1029

P17-Nef (PN)

[SEQ ID NO: 9]MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQI 60LGQLQPSLQTGSEELRSLYNTVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAA 120DTGHSNQVSQNY

MGGKWSKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAIT 180SSNTAATNAACAWLEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIHSQ 240RRQDILDLWIYHTQGYFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANKGENTSL 300LHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNC 342

Box: amino acids introduced by genetic construction. p17 sequence is in bold.

Nef-linker-p17 Nucleotide Sequence:

[SEQ ID NO: 10]atgggtggcaagtggtcaaaaagtagtgtggttggatggcctactgtaagggaaagaatg 60agacgagctgagccagcagcagatggggtgggagcagcatctcgagacctggaaaaacat 120ggagcaatcacaagtagcaatacagcagctaccaatgctgcttgtgcctggctagaagca 180caagaggaggaggaggtgggttttccagtcacacctcaggtacctttaagaccaatgact 240tacaaggcagctgtagatcttagccactttttaaaagaaaaggggggactggaagggcta 300attcactcccaacgaagacaagatatccttgatctgtggatctaccacacacaaggctac 360ttccctgattggcagaactacacaccagggccaggggtcagatatccactgacctttgga 420tggtgctacaagctagtaccagttgagccagataaggtagaagaggccaataaaggagag 480aacaccagcttgttacaccctgtgagcctgcatggaatggatgaccctgagagagaagtg 540ttagagtggaggtttgacagccgcctagcatttcatcacgtggcccgagagctgcatccg 600gagtacttcaagaactgcaggcctggatccggtggcggccctatgggtgcgagagcgtca 660gtattaagcgggggagaattagatcgatgggaaaaaattcggttaaggccagggggaaag 720aaaaaatataaattaaaacatatagtatgggcaagcagggagctagaacgattcgcagtt 780aatcctggcctgttagaaacatcagaaggctgtagacaaatactgggacagctacaacca 840tcccttcagacaggatcagaagaacttagatcattatataatacagtagcaaccctctat 900tgtgtgcatcaaaggatagagataaaagacaccaaggaagctttagacaagatagaggaa 960gagcaaaacaaaagtaagaaaaaagcacagcaagcagcagctgacacaggacacagcaat 1020caggtcagccaaaattactaa 1041

Nef-linker-p17 (NLP)

[SEQ ID NO: 11]MGGKWSKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAATNAACAWLEA 60QEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIYHTQGY 120FPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANKGENTSLLHPVSLHGMDDPEREV 180LEWRFDSRLAFHHVARELHPEYFKNC

GSGGGPMGARASVLSGGELDRWEKIRLRPGGK 240KKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLY 300CVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSNQVSQNY 346

Hexapeptide linker
Box: amino acids introduced by genetic construction.

P17-linker-Nef Nucleotide Sequence:

[SEQ ID NO: 12]atgggtgcgagagcgtcagtattaagcgggggagaattagatcgatgggaaaaaattcgg 60ttaaggccagggggaaagaaaaaatataaattaaaacatatagtatgggcaagcagggag 120ctagaacgattcgcagttaatcctggcctgttagaaacatcagaaggctgtagacaaata 180ctgggacagctacaaccatcccttcagacaggatcagaagaacttagatcattatataat 240acagtagcaaccctctattgtgtgcatcaaaggatagagataaaagacaccaaggaagct 300ttagacaagatagaggaagagcaaaacaaaagtaagaaaaaagcacagcaagcagcagct 360gacacaggacacagcaatcaggtcagccaaaattacctcgacaggcctggatccggtggc 420ggtcctatgggtggcaagtggtcaaaaagtagtgtggttggatggcctactgtaagggaa 480agaatgagacgagctgagccagcagcagatggggtgggagcagcatctcgagacctggaa 540aaacatggagcaatcacaagtagcaatacagcagctaccaatgctgcttgtgcctggcta 600gaagcacaagaggaggaggaggtgggttttccagtcacacctcaggtacctttaagacca 660atgacttacaaggcagctgtagatcttagccactttttaaaagaaaaggggggactggaa 720gggctaattcactcccaacgaagacaagatatccttgatctgtggatctaccacacacaa 780ggctacttccctgattggcagaactacacaccagggccaggggtcagatatccactgacc 840tttggatggtgctacaagctagtaccagttgagccagataaggtagaagaggccaataaa 900ggagagaacaccagcttgttacaccctgtgagcctgcatggaatggatgaccctgagaga 960gaagtgttagagtggaggtttgacagccgcctagcatttcatcacgtggcccgagagctg 1020catccggagtacttcaagaactgctaa 1047

P17-linker-Nef (PLN)

[SEQ ID NO: 13]MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQI 60LGQLQPSLQTGSEELRSLYNTVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAA 120DTGHSNQVSQNY

GSGGGPMGGKWSKSSVVGWPTVRERMRRAEPAADGVGAASRDLE 180KHGAITSSNTAATNAACAWLEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLE 240GLIHSQRRQDILDLWIYHTQGYFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANK 300GENTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNC 348

Hexapeptide linker
Box: amino acids introduced by genetic construction.

Comparative Expression of Nef-p17, p17-Nef Fusions, with and without Linkers:

The four recombinant strains were induced at 30° C. over 3 hours, in parallel to F4 and Nef producing strains. Crude extracts were prepared and analyzed by Coomassie stained gel and Western blotting.

Western Blot Analysis:

Reagents: rabbit polyclonal anti RT (rabbit PO3L16) (dilution: 1/10,000)

-   -   Alkaline phosphatase-conjugate anti-rabbit antibody (dilution:        1/7500)

As illustrated in FIG. 6, Nef-p17 and p17-Nef fusions, with and without linker, are expressed at a high level (10% of total protein).

In the Western blot, the four double fusion constructs present a multi-band pattern, but one less complex than that observed for F4. When expressed alone, the Nef and p17 proteins present single band patterns.

Strains expressing Nef-p17 (NP) and p17-Nef (PN) fusions, without linker peptide, were further analysed (solubility assays, see below).

Nef-p17 and p17-Nef Solubility Assay:

Nef-p17 and p17-Nef proteins were induced, in parallel to F4 and Nef producing strains.

Induction condition: cells grown/induced at 30° C. (+1 mM IPTG), over 3 hours.

Breaking buffer: 50 mM Tris/HCl pH: 8, 50 mM NaCl, 1 mM EDTA

Fresh Cellular Extracts:

Cellular extracts were prepared (under non-reducing conditions) and fractions corresponding to crude extracts (T), insoluble pellet (P), and supernatant (S1) were analyzed on Coomassie stained gel and Western blot.

As illustrated in FIG. 7 on Coomassie stained gel and Western blot, almost all Nef-p17, p17-Nef, as well as Nef proteins are recovered in the soluble fraction (S) of cellular extracts. For the F4 construct, 5-10% of recombinant protein is already recovered in the pellet fraction.

CONCLUSIONS

All double fusion constructs tested are highly expressed (>10% of total protein). P17-Nef and Nef-p17 fusion proteins are more soluble than F4. Both present a less complex WB pattern.

Example 4 Construction and Expression of p24-RT*-Nef-p17 (F4*)

F4* is a mutated version of the F4 (p24-RT/p66-Nef-p17) fusion where the Methionine at position 592 is replaced by a Lysine. This methionine is a putative internal transcriptional “start” site, as supported by N-terminal sequencing performed on a Q sepharose eluate sample of an F4 purification experiment. Indeed, the major F4-related small band at 62 kDa present in the Q eluate sample starts at methionine 592.

Methionine is replaced by a lysine: RMR→RKR. The RKR motif is naturally present in clade A RT sequences.
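For illustration, the RMR→RKR substitution can be expressed as a position-checked point mutation on a ten-residue window around position 592. The window below corresponds to residues 588-597 of the fusion as read from the listings (the non-mutated residue M is taken from the corresponding stretch of the F3 listing). The function `substitute` and its parameter names are hypothetical and serve only as a sketch.

```python
def substitute(protein, position, expected, replacement, first_residue=1):
    """Replace one residue at a 1-based position, verifying the residue replaced.

    `first_residue` is the sequence number of protein[0], which allows a short
    window of a longer protein to be addressed with full-length numbering.
    """
    idx = position - first_residue
    if not 0 <= idx < len(protein):
        raise IndexError(f"position {position} lies outside the supplied window")
    if protein[idx] != expected:
        raise ValueError(f"expected {expected} at {position}, found {protein[idx]}")
    return protein[:idx] + replacement + protein[idx + 1:]


# Residues 588-597 of the non-mutated F4 fusion.
window = "KYARMRGAHT"
mutated = substitute(window, 592, "M", "K", first_residue=588)

print(window, "->", mutated)
```

Checking the expected residue before replacing it guards against off-by-one numbering errors, which are easy to make when positions are quoted relative to the full fusion protein.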

The impact of this mutation on CD4-CD8 epitopes was evaluated:

-   -   one HLA-A3 CTL epitope (A*3002) is lost, but 9 other HLA-A3 epitopes are present in the RT sequence.
    -   No helper epitope identified in this region.

Recombinant Protein Characteristics:

Length, Molecular Weight, Isoelectric Point (IP):

-   -   1136 AA, 129 kDa, IP: 8.07

Nucleotide Sequence:

[SEQ ID NO: 14]atggttatcgtgcagaacatccaggggcaaatggtacatcaggccatatcacctagaactttaaatgcatgggtaaaagtagtagaagagaaggctttcagcccagaagtaatacccatgttttcagcattatcagaaggagccaccccacaagatttaaacaccatgctaaacacagtggggggacatcaagcagccatgcaaatgttaaaagagaccatcaatgaggaagctgcagaatgggatagagtacatccagtgcatgcagggcctattgcaccaggccagatgagagaaccaaggggaagtgacatagcaggaactactagtacccttcaggaacaaataggatggatgacaaataatccacctatcccagtaggagaaatttataaaagatggataatcctgggattaaataaaatagtaagaatgtatagccctaccagcattctggacataagacaaggaccaaaagaaccttttagagactatgtagaccggttctataaaactctaagagccgagcaagcttcacaggaggtaaaaaattggatgacagaaaccttgttggtccaaaatgcgaacccagattgtaagactattttaaaagcattgggaccagcggctacactagaagaaatgatgacagcatgtcagggagtaggaggacccggccataaggcaagagttttg

ggccccattagccctattgagactgtgtcagtaaaattaaagccaggaatggatggcccaaaagttaaacaatggccattgacagaagaaaaaataaaagcattagtagaaatttgtacagagatggaaaaggaagggaaaatttcaaaaattgggcctgaaaatccatacaatactccagtatttgccataaagaaaaaagacagtactaaatggagaaaattagtagatttcagagaacttaataagagaactcaagacttctgggaagttcaattaggaataccacatcccgcagggttaaaaaagaaaaaatcagtaacagtactggatgtgggtgatgcatatttttcagttcccttagatgaagacttcaggaaatatactgcatttaccatacctagtataaacaatgagacaccagggattagatatcagtacaatgtgcttccacagggatggaaaggatcaccagcaatattccaaagtagcatgacaaaaatcttagagccttttagaaaacaaaatccagacatagttatctatcaatacatggatgatttgtatgtaggatctgacttagaaatagggcagcatagaacaaaaatagaggagctgagacaacatctgttgaggtggggacttaccacaccagacaaaaaacatcagaaagaacctccattccttaaaatgggttatgaactccatcctgataaatggacagtacagcctatagtgctgccagaaaaagacagctggactgtcaatgacatacagaagttagtggggaaattgaattgggcaagtcagatttacccagggattaaagtaaggcaattatgtaaactccttagaggaaccaaagcactaacagaagtaataccactaacagaagaagcagagctagaactggcagaaaacagagagattctaaaagaaccagtacatggagtgtattatgacccatcaaaagacttaatagcagaaatacagaagcaggggcaaggccaatggacatatcaaatttatcaagagccatttaaaaatctgaaaacaggaaaatatgcacgtaaacgcggtgcccacactaatgatgtaaaacaattaacagaggcagtgcaaaaaataaccacagaaagcatagtaatatggggaaagactcctaaatttaaactgcccatacaaaaggaaacatgggaaacatggtggacagagtattggcaagccacctggattcctgagtgggagtttgttaatacccctcctttagtgaaattatggtaccagttagagaaagaacccatagtaggagcagaaaccttctatgtagatggggcagctaacagggagactaaattaggaaaagcaggatatgttactaatagaggaagacaaaaagttgtcaccctaactgacacaacaaatcagaagactgagttacaagcaatttatctagctttgcaggattcgggattagaagtaaacatagtaacagactcacaatatgcattaggaatcattcaagcacaaccagatcaaagtgaatcagagttagtcaatcaaataatagagcagttaataaaaaaggaaaaggtctatctggcatgggtaccagcacacaaaggaattggaggaaatgaacaagtagataaattagtcagtgctggaatcaggaaagtgcta

ggtggca agtggtcaaaaagtagtgtggttggatggcctactgtaagggaaagaatgagacgagctgagccagcagcagatggggtgggagcagcatctcgagacctggaaaaacatggagcaatcacaagtagcaatacagcagctaccaatgctgcttgtgcctggctagaagcacaagaggaggaggaggtgggttttccagtcacacctcaggtacctttaagaccaatgacttacaaggcagctgtagatcttagccactttttaaaagaaaaggggggactggaagggctaattcactcccaacgaagacaagatatccttgatctgtggatctaccacacacaaggctacttccctgattggcagaactacacaccagggccaggggtcagatatccactgacctttggatggtgctacaagctagtaccagttgagccagataaggtagaagaggccaataaaggagagaacaccagcttgttacaccctgtgagcctgcatggaatggatgaccctgagagagaagtgttagagtggaggtttgacagccgcctagcatttcatcacgtggcccgagagctgcatccggagtacttca agaactgc

atgggtgcgagagcgtcagtattaagcgggggagaattagatcgatgggaaaaaattcggttaaggccagggggaaagaaaaaatataaattaaaacatatagtatgggcaagcagggagctagaacgattcgcagttaatcctggcctgttagaaacatcagaaggctgtagacaaatactgggacagctacaaccatcccttcagacaggatcagaagaacttagatcattatataatacagtagcaaccctctattgtgtgcatcaaaggatagagataaaagacaccaaggaagctttagacaagatagaggaagagcaaaacaaaagtaagaaaaaagcacagcaagcagcagctgacacaggacacagcaatcaggtcagccaaaattactaa

p24 sequence is in bold
Nef sequence is underlined
Boxes: nucleotides introduced by genetic construction

Amino-Acid Sequence

[SEQ ID NO: 15] MVIVQNIQGQMVHQAISPRTLNAWVKVVEEKAFSPEVIPMFSALSEGATP 50QDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREP 100RGSDIAGTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTS 150ILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCK 200TILKALGPAATLEEMMTACQGVGGPGHKARVL

GPISPIETVSVKLKPG 250 MDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGPENPYNTPVFAIKK300 KDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAY 350FSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMT 400KILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRWGLT 450TPDKKHQKEPPFLKMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLN 500WASQIYPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAENREILKEPVH 550GVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARKRGAHTNDV 600KQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWE 650FVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVTNRGRQK 700VVTLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQYALGIIQAQPDQSES 750ELVNQIIEQLIKKEKVYLAWVPAHKGIGGNEQVDKLVSAGIRKV

MGGK 800 WSKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAATNAA 850CAWLEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIHSQ 900RRQDILDLWIYHTQGYFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVE 950EANKGENTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFK 1000 NC

MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAV 1050NPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLYCVHQRIEIKD 1100TKEALDKIEEEQNKSKKKAQQAAADTGHSNQVSQNY 1136

P24 sequence: amino acids 1-232 (in bold)
RT sequence: amino acids 235-795
Nef sequence: amino acids 798-1002
P17 sequence: amino acids 1005-1136
Boxes: amino acids introduced by genetic construction
K (Lysine): instead of Methionine (internal “start” codon)
K (Lysine): instead of Tryptophan (W). Mutation introduced to remove enzyme activity.
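The domain boundaries given in the legend can be tabulated to confirm that the four segments and the short junctions between them account for all 1136 residues. The following is a minimal sketch that takes the legend coordinates at face value (1-based, inclusive); the function name `summarize` is hypothetical.

```python
# Domain boundaries as stated in the legend (1-based, inclusive).
DOMAINS = [("p24", 1, 232), ("RT", 235, 795), ("Nef", 798, 1002), ("p17", 1005, 1136)]
TOTAL = 1136


def summarize(domains, total):
    """Return per-domain lengths and the residue ranges not covered by any domain."""
    lengths = {name: end - start + 1 for name, start, end in domains}
    gaps = []
    prev_end = 0
    for _, start, end in domains:
        if start > prev_end + 1:
            gaps.append((prev_end + 1, start - 1))
        prev_end = end
    if prev_end < total:
        gaps.append((prev_end + 1, total))
    return lengths, gaps


lengths, gaps = summarize(DOMAINS, TOTAL)
junction_residues = sum(end - start + 1 for start, end in gaps)

print(lengths)
print(gaps)
print(sum(lengths.values()) + junction_residues)
```

Under these coordinates the uncovered ranges are three two-residue junctions, consistent with the short stretches introduced by genetic construction between the fusion partners.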

F4* Expression in B834(DE3) Cells:

The F4* recombinant strain was induced at 22° C. during 18 h, in parallel to the F4 non-mutated construct. Crude extracts were prepared and analyzed by Coomassie stained gel and Western blotting.

As illustrated in FIG. 8, F4* was expressed at a high level (10% of total protein), slightly higher than F4, and the small 62 kDa band disappeared.

Western Blot Analysis:

Reagents:

-   -   pool of 3 Mabs anti p24 (JC13.1, JC16.1, IG8.1.1) (dilution: 1/5000)
    -   rabbit polyclonal anti RT (rabbit PO3L16) (dilution: 1/10,000)
    -   rabbit polyclonal anti Nef-Tat (rabbit 388) (dilution: 1/10,000)
    -   Alkaline phosphatase-conjugate anti-rabbit antibody (dilution: 1/7500)
    -   Alkaline phosphatase-conjugate anti-mouse antibody (dilution: 1/7500)

Example 5 Construction and Expression of F3 and F3* (Mutated F3)

F3 is the fusion p17-p51-Nef; F3* (p17-p51*-Nef) is the variant in which the putative internal Methionine initiation site is replaced by Lysine.

F3 and F3* fusions could be used in combination with p24.

Recombinant Plasmids Construction:

F3: The sequence encoding p51 was excised (as a ScaI and StuI DNA fragment) from the pET29a/p51 expression plasmid and ligated into the pET28b/p17-Nef plasmid, at the StuI site (located between the p17 and Nef genes), as an in-frame fusion with the p17 and Nef sequences. The resulting fusion construct p17-p51-Nef is named F3.

F3*: Mutation of the putative internal methionine initiation site was achieved using the “GeneTailor Site-Directed Mutagenesis System” (Invitrogen), generating the F3* construct. F3 and F3* plasmids were used to transform B834(DE3) cells.
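The StuI site used for this insertion can be located in the p17-Nef gene by a simple in silico search. The sketch below scans the p17/Nef junction copied from the P17-Nef nucleotide listing (SEQ ID NO: 8) for the StuI recognition sequence AGGCCT, which StuI cuts in the middle (AGG^CCT) to leave blunt ends. The recognition sequence is standard enzyme data supplied here as an assumption, since the text does not state it, and the function name `find_sites` is hypothetical.

```python
import re

# StuI recognition sequence; the enzyme cuts blunt between AGG and CCT.
# (ScaI, used for the other end of the p51 fragment, is also a blunt cutter.)
STUI_SITE = "AGGCCT"
STUI_CUT_OFFSET = 3


def find_sites(dna, site):
    """Return 1-based start positions of every occurrence of `site` in `dna`."""
    return [m.start() + 1 for m in re.finditer(f"(?={site})", dna.upper())]


# p17/Nef junction: ...end of p17, ctc gac agg cct, start of Nef...
junction = "gtcagccaaaattacctcgacaggcctatgggtggcaag"

positions = find_sites(junction, STUI_SITE)
cut = positions[0] - 1 + STUI_CUT_OFFSET  # 0-based index of the first base after the cut
left, right = junction[:cut], junction[cut:]

print(positions)
print(left, "|", right)
```

Because both ends of the excised p51 fragment and the opened vector are blunt, they can be ligated directly; the fusion stays in frame only if the inserted fragment length is a multiple of three and the cut falls on a codon boundary.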

Recombinant Protein Characteristics:

Length, Molecular Weight, Isoelectric Point (IP):

-   -   770 AA, 88.5 kDa, IP: 8.58

Nucleotide Sequence (for F3*)

[SEQ ID NO: 16]atgggtgcgagagcgtcagtattaagcgggggagaattagatcgatgggaaaaaattcgg 60ttaaggccagggggaaagaaaaaatataaattaaaacatatagtatgggcaagcagggag 120ctagaacgattcgcagttaatcctggcctgttagaaacatcagaaggctgtagacaaata 180ctgggacagctacaaccatcccttcagacaggatcagaagaacttagatcattatataat 240acagtagcaaccctctattgtgtgcatcaaaggatagagataaaagacaccaaggaagct 300ttagacaagatagaggaagagcaaaacaaaagtaagaaaaaagcacagcaagcagcagct 360gacacaggacacagcaatcaggtcagccaaaattacctcgac

GGTCCGATCTCT 420CCGATAGAAACAGTTTCGGTCAAGCTTAAACCAGGGATGGATGGTCCAAAGGTCAAGCAG 480TGGCCGCTAACGGAAGAGAAGATTAAGGCGCTCGTAGAGATTTGTACTGAAATGGAGAAG 540GAAGGCAAGATAAGCAAGATCGGGCCAGAGAACCCGTACAATACACCGGTATTTGCAATA 600AAGAAGAAGGATTCAACAAAATGGCGAAAGCTTGTAGATTTTAGGGAACTAAACAAGCGA 660ACCCAAGACTTTTGGGAAGTCCAACTAGGTATCCCACATCCAGCCGGTCTAAAGAAGAAG 720AAATCGGTCACAGTCCTGGATGTAGGAGACGCATATTTTAGTGTACCGCTTGATGAGGAC 780TTCCGAAAGTATACTGCGTTTACTATACCGAGCATAAACAATGAAACGCCAGGCATTCGC 840TATCAGTACAACGTGCTCCCGCAGGGCTGGAAGGGGTCTCCGGCGATATTTCAGAGCTCT 900ATGACAAAAATACTTGAACCATTCCGAAAGCAGAATCCGGATATTGTAATTTACCAATAC 960ATGGACGATCTCTATGTGGGCTCGGATCTAGAAATTGGGCAGCATCGCACTAAGATTGAG 1020GAACTGAGGCAACATCTGCTTCGATGGGGCCTCACTACTCCCGACAAGAAGCACCAGAAG 1080GAGCCGCCGTTCCTAAAGATGGGCTACGAGCTTCATCCGGACAAGTGGACAGTACAGCCG 1140ATAGTGCTGCCCGAAAAGGATTCTTGGACCGTAAATGATATTCAGAAACTAGTCGGCAAG 1200CTTAACTGGGCCTCTCAGATTTACCCAGGCATTAAGGTCCGACAGCTTTGCAAGCTACTG 1260AGGGGAACTAAGGCTCTAACAGAGGTCATCCCATTAACGGAGGAAGCAGAGCTTGAGCTG 1320GCAGAGAATCGCGAAATTCTTAAGGAGCCGGTGCACAGGGTATACTACGACCCCTCCAAG 1380GACCTTATAGCCGAGATCCAGAAGCAGGGGCAGGGCCAATGGACGTACCAGATATATCAA 1440GAACCGTTTAAGAATCTGAAGACTGGGAAGTACGCGCGCAAACGAGGGGCTCATACTAAT 1500GATGTAAAGCAACTTACGGAAGCAGTACAAAAGATTACTACTGAGTCTATTGTGATATGG 1560GGCAAGACCCCAAAGTTCAAGCTGCCCATACAGAAGGAAACATGGGAAACATGGTGGACT 1620GAATATTGGCAAGCTACCTGGATTCCAGAATGGGAATTTGTCAACACGCCGCCGCTGGTA 1680 AAACTG

ATGggtggcaagtggtcaaaaagtagtgtggttggatggcctactgta 1740agggaaagaatgagacgagctgagccagcagcagatggggtgggagcagcatctcgagac 1800ctggaaaaacatggagcaatcacaagtagcaatacagcagctaccaatgctgcttgtgcc 1860tggctagaagcacaagaggaggaggaggtgggttttccagtcacacctcaggtaccttta 1920agaccaatgacttacaaggcagctgtagatcttagccactttttaaaagaaaagggggga 1980ctggaagggctaattcactcccaacgaagacaagatatccttgatctgtggatctaccac 2040acacaaggctacttccctgattggcagaactacacaccagggccaggggtcagatatcca 2100ctgacctttggatggtgctacaagctagtaccagttgagccagataaggtagaagaggcc 2160aataaaggagagaacaccagcttgttacaccctgtgagcctgcatggaatggatgaccct 2220gagagagaagtgttagagtggaggtttgacagccgcctagcatttcatcacgtggcccga 2280gagctgcatccggagtacttcaagaactgctaa 2313

P17: sequence in bold
P51: sequence in capital letters
Nef: sequence in small letters
Boxes: nucleotides introduced by genetic construction

Amino-Acid Sequence (for F3)

[SEQ ID NO: 17]MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQI 60LGQLQPSLQTGSEELRSLYNTVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAA 120DTGHSNQVSQNY

GPISPIETVSVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEK 180EGKISKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKK 240KSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSS 300MTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRWGLTTPDKKHQK 360EPPFLKMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLL 420RGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQ 480EPFKNLKTGKYARMRGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWT 540EYWQATWIPEWEFVNTPPLVKL

MGGKWSKSSVVGWPTVRERMRRAEPAADGVGAASRD 600LEKHGAITSSNTAATNAACAWLEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGG 660LEGLIHSQRRQDILDLWIYHTQGYFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEA 720NKGENTSLLHPVSLHGMDDPEREVLEWRFDSRLAFHHVARELHPEYFKNC 770

P17 sequence: amino acids 1-134 (in bold)
P51 sequence: amino acids 137-562
Nef sequence: amino acids 565-770
Boxes: amino acids introduced by genetic construction
Methionine 494 replaced by Lysine (K) in F3* construct
K (Lysine): instead of Tryptophan (W). Mutation introduced to remove enzyme activity.

F3 Expression in B834(DE3) Cells:

F3 expression level and recombinant protein solubility were evaluated, in parallel to F4 (p24-p66-Nef-p17) and p17-Nef (F2) production strains.

Induction condition: cells grown at 37° C./induced at 30° C. (+1 mM IPTG), during 3 h.

Breaking buffers:

-   -   F4: 50 mM Tris/HCl pH: 8.0, 50 mM NaCl, 1 mM EDTA, +/−1 mM DTT
    -   F2: 50 mM Tris/HCl pH: 8.0, 50 mM NaCl, 1 mM EDTA, without DTT
    -   F3: 50 mM Tris/HCl pH: 7.5, 50 mM NaCl, 1 mM EDTA, +/−1 mM DTT

Western Blot Analysis:

Reagents:

-   -   rabbit polyclonal anti RT (rabbit PO3L16) (dilution: 1/10,000)
    -   rabbit polyclonal anti Nef-Tat (rabbit 388) (dilution: 1/10,000)
    -   Alkaline phosphatase-conjugate anti-rabbit antibody (dilution: 1/7500)

“Fresh” Cellular Extracts

Cellular fractions corresponding to crude extracts (T), insoluble pellet (P) and supernatant (S) were analyzed on 10% reducing SDS-PAGE. As illustrated in FIG. 9, the F3 fusion protein is expressed at a high level (10% of total protein). Almost all F3 is recovered in the soluble fraction (S) of cellular extracts, whereas 5-10% of the F4 product is already associated with the pellet fraction. The WB pattern is simplified compared to F4.

F3* Expression in B834(DE3) Cells:

The F3* recombinant strain was induced at 37° C. over 3 h, in parallel to the F3 non-mutated construct. Crude cellular extracts were prepared and analyzed by Coomassie stained gel and Western blotting. As illustrated in FIG. 10, the F3* fusion protein is expressed at a very high level (10-20% of total protein). The WB pattern was simplified compared to F3; a very faint band at +/−32 kDa (detected on WB only) had disappeared.

Example 6 Construction and Expression of F4(p51) and F4(p51)*

RT/p51 was used in the F4 fusion construct (in place of RT/p66).

F4(p51)=p24-p51-Nef-p17

F4(p51)*=p24-p51*-Nef-p17: mutated F4(p51), in which the putative internal Methionine initiation site (present in the RT portion) is replaced by Lysine, to further simplify the antigen pattern.

Recombinant Plasmids Construction:

F4(p51): The sequence encoding p51 was amplified by PCR from the pET29a/p51 expression plasmid. Restriction sites were incorporated into the PCR primers (NdeI and StuI at the 5′ end; AvrII at the 3′ end of the coding sequence). The PCR product was cloned into the pGEM-T intermediate plasmid and sequenced. The pGEM-T/p51 intermediate plasmid was restricted by NdeI and AvrII, and the p51 fragment was ligated into the pET28b/p24-RT/p66-Nef-p17 expression plasmid restricted by NdeI and NheI (resulting in the excision of the RT/p66 sequence). Ligation was performed by combining digestion reactions in appropriate concentrations, in the presence of T4 DNA ligase. The ligation product was used to transform DH5α E. coli cells. Insertion of p51 in the correct translational reading frame (in place of RT/p66 in the F4 fusion) was confirmed by DNA sequencing. The resulting fusion construct p24-RT/p51-Nef-p17 is named F4(p51).

F4(p51)*: Mutation of the putative internal methionine initiation site (present in RT/p51) was achieved with the “GeneTailor Site-Directed Mutagenesis System” (Invitrogen), generating the F4(p51)* construct.

F4(p51) and F4(p51)* expression plasmids were used to transform B834(DE3) cells.
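The ligation of an AvrII-cut insert end into an NheI-cut vector end relies on the two enzymes producing identical cohesive ends. The sketch below illustrates this with the standard recognition sequences and cut positions of NdeI (CA^TATG), AvrII (C^CTAGG) and NheI (G^CTAGC). These values are supplied here as assumptions, since the text does not state them, and the function names are hypothetical.

```python
# Recognition site and top-strand cut offset for each enzyme (palindromic sites).
ENZYMES = {
    "NdeI": ("CATATG", 2),   # CA^TATG
    "AvrII": ("CCTAGG", 1),  # C^CTAGG
    "NheI": ("GCTAGC", 1),   # G^CTAGC
}


def five_prime_overhang(name):
    """Single-stranded 5' overhang left by an enzyme cutting a palindromic site."""
    site, cut = ENZYMES[name]
    return site[cut:len(site) - cut]


def compatible(a, b):
    """Two cohesive ends can anneal when their overhangs are identical."""
    return five_prime_overhang(a) == five_prime_overhang(b)


def ligated_junction(left, right):
    """Top-strand sequence formed by joining the left half of one site to the right half of another."""
    left_site, left_cut = ENZYMES[left]
    right_site, right_cut = ENZYMES[right]
    return left_site[:left_cut] + right_site[right_cut:]


hybrid = ligated_junction("AvrII", "NheI")

print({name: five_prime_overhang(name) for name in ENZYMES})
print(compatible("AvrII", "NheI"), hybrid)
```

The hybrid junction matches neither parental recognition sequence, so under these assumptions the AvrII/NheI joint would not be recut by either enzyme, while the NdeI end regenerates an NdeI site.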

Recombinant Proteins Characteristics:

Length, Molecular Weight, Isoelectric Point (IP):

-   -   1005 AA, 114.5 kDa, IP: 8.47

Nucleotide Sequence (for F4(p51)*)

[SEQ ID NO: 18]Atggttatcgtgcagaacatccaggggcaaatggtacatcaggccatatcacctagaact 60Ttaaatgcatgggtaaaagtagtagaagagaaggctttcagcccagaagtaatacccatg 120Ttttcagcattatcagaaggagccaccccacaagatttaaacaccatgctaaacacagtg 180Gggggacatcaagcagccatgcaaatgttaaaagagaccatcaatgaggaagctgcagaa 240Tgggatagagtacatccagtgcatgcagggcctattgcaccaggccagatgagagaacca 300Aggggaagtgacatagcaggaactactagtacccttcaggaacaaataggatggatgaca 360Aataatccacctatcccagtaggagaaatttataaaagatggataatcctgggattaaat 420Aaaatagtaagaatgtatagccctaccagcattctggacataagacaaggaccaaaagaa 480Ccttttagagactatgtagaccggttctataaaactctaagagccgagcaagcttcacag 540Gaggtaaaaaattggatgacagaaaccttgttggtccaaaatgcgaacccagattgtaag 600Actattttaaaagcattgggaccagcggctacactagaagaaatgatgacagcatgtcag 660Ggagtaggaggacccggccataaggcaagagttttg

GGTCCGATCTCT 720CCGATAGAAACAGTTTCGGTCAAGCTTAAACCAGGGATGGATGGTCCAAAGGTCAAGCAG 780TGGCCGCTAACGGAAGAGAAGATTAAGGCGCTCGTAGAGATTTGTACTGAAATGGAGAAG 840GAAGGCAAGATAAGCAAGATCGGGCCAGAGAACCCGTACAATACACCGGTATTTGCAATA 900AAGAAGAAGGATTCAACAAAATGGCGAAAGCTTGTAGATTTTAGGGAACTAAACAAGCGA 960ACCCAAGACTTTTGGGAAGTCCAACTAGGTATCCCACATCCAGCCGGTCTAAAGAAGAAG 1020AAATCGGTCACAGTCCTGGATGTAGGAGACGCATATTTTAGTGTACCGCTTGATGAGGAC 1080TTCCGAAAGTATACTGCGTTTACTATACCGAGCATAAACAATGAAACGCCAGGCATTCGC 1140TATCAGTACAACGTGCTCCCGCAGGGCTGGAAGGGGTCTCCGGCGATATTTCAGAGCTCT 1200ATGACAAAAATACTTGAACCATTCCGAAAGCAGAATCCGGATATTGTAATTTACCAATAC 1260ATGGACGATCTCTATGTGGGCTCGGATCTAGAAATTGGGCAGCATCGCACTAAGATTGAG 1320GAACTGAGGCAACATCTGCTTCGATGGGGCCTCACTACTCCCGACAAGAAGCACCAGAAG 1380GAGCCGCCGTTCCTAAAGATGGGCTACGAGCTTCATCCGGACAAGTGGACAGTACAGCCG 1440ATAGTGCTGCCCGAAAAGGATTCTTGGACCGTAAATGATATTCAGAAACTAGTCGGCAAG 1500CTTAACTGGGCCTCTCAGATTTACCCAGGCATTAAGGTCCGACAGCTTTGCAAGCTACTG 1560AGGGGAACTAAGGCTCTAACAGAGGTCATCCCATTAACGGAGGAAGCAGAGCTTGAGCTG 1620GCAGAGAATCGCGAAATTCTTAAGGAGCCGGTGCACAGGGTATACTACGACCCCTCCAAG 1680GACCTTATAGCCGAGATCCAGAAGCAGGGGCAGGGCCAATGGACGTACCAGATATATCAA 1740GAACCGTTTAAGAATCTGAAGACTGGGAAGTACGCGCGCAAACGAGGGGCTCATACTAAT 1800GATGTAAAGCAACTTACGGAAGCAGTACAAAAGATTACTACTGAGTCTATTGTGATATGG 1860GGCAAGACCCCAAAGTTCAAGCTGCCCATACAGAAGGAAACATGGGAAACATGGTGGACT 1920GAATATTGGCAAGCTACCTGGATTCCAGAATGGGAATTTGTCAACACGCCGCCGCTGGTA 1980 AAACTG

ATGggtggcaagtggtcaaaaagtagtgtggttggatggcctact 2040Gtaagggaaagaatgagacgagctgagccagcagcagatggggtgggagcagcatctcga 2100Gacctggaaaaacatggagcaatcacaagtagcaatacagcagctaccaatgctgcttgt 2160Gcctggctagaagcacaagaggaggaggaggtgggttttccagtcacacctcaggtacct 2220Ttaagaccaatgacttacaaggcagctgtagatcttagccactttttaaaagaaaagggg 2280Ggactggaagggctaattcactcccaacgaagacaagatatccttgatctgtggatctac 2340Cacacacaaggctacttccctgattggcagaactacacaccagggccaggggtcagatat 2400Ccactgacctttggatggtgctacaagctagtaccagttgagccagataaggtagaagag 2460Gccaataaaggagagaacaccagcttgttacaccctgtgagcctgcatggaatggatgac 2520Cctgagagagaagtgttagagtggaggtttgacagccgcctagcatttcatcacgtggcc 2580Cgagagctgcatccggagtacttcaagaactgc

ATGGGTGCGAGAGCGTCAGTA 2640TTAAGCGGGGGAGAATTAGATCGATGGGAAAAAATTCGGTTAAGGCCAGGGGGAAAGAAA 2700AAATATAAATTAAAACATATAGTATGGGCAAGCAGGGAGCTAGAACGATTCGCAGTTAAT 2760CCTGGCCTGTTAGAAACATCAGAAGGCTGTAGACAAATACTGGGACAGCTACAACCATCC 2820CTTCAGACAGGATCAGAAGAACTTAGATCATTATATAATACAGTAGCAACCCTCTATTGT 2880GTGCATCAAAGGATAGAGATAAAAGACACCAAGGAAGCTTTAGACAAGATAGAGGAAGAG 2940CAAAACAAAAGTAAGAAAAAAGCACAGCAAGCAGCAGCTGACACAGGACACAGCAATCAG 3000GTCAGCCAAAATTACtaa 3018

P24: sequence in bold
P51: sequence in capital letters
Nef: sequence in small letters
P17: sequence underlined
Boxes: nucleotides introduced by genetic construction

Amino-Acid Sequence (for F4(p51)*)

[SEQ ID NO: 19]MVIVQNIQGQMVHQAISPRTLNAWVKVVEEKAFSPEVIPMFSALSEGATPQDLNTMLNTV 60GGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTTSTLQEQIGWMT 120NNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQ 180EVKNWMTETLLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVL

GPIS 240 PIETVSVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIGPENPYNTPVFAI300 KKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDED 360FRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQY 420MDDLYVGSDLEIGQHRTKIEELRQHLLRWGLTTPDKKHQKEPPFLKMGYELHPDKWTVQP 480IVLPEKDSWTVNDIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEVIPLTEEAELEL 540AENREILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARKRGAHTN 600DVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLV 660 KL

MGGKWSKSSVVGWPTVRERMRRAEPAADGVGAASRDLEKHGAITSSNTAATNAAC 720AWLEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIHSQRRQDILDLWIY 780HTQGYFPDWQNYTPGPGVRYPLTFGWCYKLVPVEPDKVEEANKGENTSLLHPVSLHGMDD 840PEREVLEWRFDSRLAFHHVARELHPEYFKNC

MGARASVLSGGELDRWEKIRLRPGGKK 900KYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLYC 960VHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSNQVSQNY 1005

P24: amino-acids 1-232
P51: amino-acids 237-662
Nef: amino-acids 666-871
P17: amino-acids 874-1005
K (Lysine): instead of Methionine (internal “start” codon)
K (Lysine): instead of Tryptophan (W). Mutation introduced to remove enzyme activity.

F4(p51) Expression in B834(DE3) Cells:

F4(p51) expression level and recombinant protein solubility were evaluated in parallel to the F4-expressing strain.

Induction condition: cells grown at 37° C./induced at 22° C. (+1 mM IPTG), over 19 h.

Breaking buffer: 50 mM Tris/HCl pH 7.5, 1 mM EDTA, 1 mM DTT

Western Blot Analysis:

Reagents:

-   rabbit polyclonal anti-RT (rabbit PO3L16) (dilution: 1/10 000)
-   rabbit polyclonal anti-Nef-Tat (rabbit 388) (dilution: 1/10 000)
-   alkaline phosphatase-conjugated anti-rabbit antibody (dilution: 1/7500)

Cellular fractions corresponding to crude extracts (T), insoluble pellet (P) and supernatant (S) were analyzed on 10% reducing SDS-PAGE.

As illustrated in FIG. 11, F4(p51) was expressed at a high level (10% of total protein), similar to F4. Almost all F4(p51) was recovered in the soluble fraction (S) of cellular extracts. Upon detection with an anti-Nef-Tat reagent, the F4(p51) WB pattern was shown to be simplified (reduction of truncated products below +/−60 kDa).

F4(p51)* Expression in B834(DE3) Cells:

The F4(p51)* recombinant strain was induced at 22° C. over 18 h, in parallel to the non-mutated F4(p51) construct, F4 and F4*. Crude cellular extracts were prepared and analyzed by Coomassie-stained gel and Western blotting. As illustrated in FIG. 12, high expression of the F4(p51) and F4(p51)* fusions was observed, representing at least 10% of total protein. WB pattern: reduction of truncated products below +/−60 kDa. In addition, for the F4(p51)* construct, the 47 kDa band (due to the internal start site) has disappeared.

Example 7 Purification of F4, F4(p51)* and F4*—Purification Method I

The fusion protein F4, comprising the 4 HIV antigens p24-RT-Nef-p17, was purified from an E. coli cell homogenate according to purification method I, which comprises the following principal steps:

-   Ammonium sulfate precipitation of F4
-   SO3 Fractogel cation-exchange chromatography (positive mode)
-   Octyl sepharose hydrophobic interaction chromatography (positive mode)
-   Q sepharose FF anion-exchange chromatography (positive mode)
-   Superdex 200 gel filtration chromatography in presence of SDS
-   Dialysis and concentration

Additionally, the F4(p51)* fusion protein (RT replaced by the codon optimized p51 carrying an additional mutation Met592Lys) and the F4* protein (F4 carrying an additional Met592Lys mutation) were purified using the same purification method I.

Protein Quantification

-   Total protein was determined using the Lowry assay. Before measuring the protein concentration, all samples were dialyzed overnight against PBS, 0.1% SDS to remove interfering substances (urea, DTT). BSA (Pierce) was used as the standard.

SDS-PAGE and Western Blot

-   Samples were prepared in reducing or non-reducing SDS-PAGE sample buffer (+/−β-mercaptoethanol) and heated for 5 min at 95° C.
-   Proteins were separated on 4-20% SDS-polyacrylamide gels at 200 V for 75 min using pre-cast Novex Tris-glycine gels or Criterion gels (Bio-Rad), 1 mm thick.
-   Proteins were visualized with Coomassie-blue R250.
-   For the western blots (WB), the proteins were transferred from the SDS-gel onto nitrocellulose membranes (Bio-Rad) at 4° C. for 1.5 h at 100 V or overnight at 30 V.
-   F4 was detected using monoclonal antibodies against the different antigens: anti-p24, anti-Nef-Tat and anti-RT (sometimes a mixture of anti-p24 and anti-Nef-Tat was used to detect a maximum number of protein bands).
-   Alkaline-phosphatase conjugated anti-mouse or anti-rabbit antibodies were bound to the primary antibodies and protein bands were visualized using BCIP and NBT as the substrates.

Anti-E. coli Western Blot

-   5 μg protein (Lowry) was separated by SDS-PAGE and transferred onto nitrocellulose membranes as above.
-   Residual host cell proteins were detected using polyclonal anti-E. coli antibodies. Protein bands were visualized with the alkaline-phosphatase reaction as above.

Purification Method I

Method I comprises a precipitation by ammonium sulfate and four chromatographic steps:

-   E. coli cells were homogenized in 50 mM Tris buffer at pH 8.0 in the presence of 10 mM DTT, 1 mM PMSF, 1 mM EDTA at OD50 (~360 ml). 2 Rannie passages were applied at 1000 bars.
-   Cell debris and insoluble material were removed by centrifugation at 14400×g for 20 min.
-   Ammonium sulfate (AS) was added from a 3.8M stock solution to the clarified supernatant to a final concentration of 1.2M. Proteins were precipitated for ~2 hours at room temperature (RT) and then pelleted by centrifugation (10 min at 14400×g). The pellet was resuspended in 8M urea, 10 mM DTT in 10 mM phosphate buffer at pH 7.0.
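The ammonium sulfate addition described above is a simple mixing calculation. The sketch below is illustrative only: it assumes that volumes are additive on mixing, and the 300 ml supernatant volume is a hypothetical value, since the volume of the clarified supernatant is not stated.

```python
def stock_volume_to_add(sample_volume_ml, stock_molar, final_molar):
    """Volume of stock (ml) to add so the mixture reaches final_molar.

    Derived from stock_molar * v = final_molar * (sample_volume_ml + v),
    assuming the sample contains no ammonium sulfate and volumes are additive.
    """
    if not 0 <= final_molar < stock_molar:
        raise ValueError("final concentration must be below the stock concentration")
    return sample_volume_ml * final_molar / (stock_molar - final_molar)


# Hypothetical supernatant volume; 3.8 M stock and 1.2 M target are from the text.
sample_ml = 300.0
added_ml = stock_volume_to_add(sample_ml, stock_molar=3.8, final_molar=1.2)

# Back-check: concentration actually reached after mixing.
reached_molar = 3.8 * added_ml / (sample_ml + added_ml)

print(f"add {added_ml:.1f} ml of 3.8 M stock to {sample_ml:.0f} ml supernatant")
print(f"final ammonium sulfate concentration: {reached_molar:.2f} M")
```

Under these assumptions, roughly 0.46 volumes of stock are needed per volume of supernatant, whatever the actual supernatant volume is.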

-   The antigen was captured on an SO3 Fractogel column (Merck) in the presence of 8M urea and 10 mM DTT at pH 7.0 in phosphate buffer. The column was washed to elute non-bound protein, followed by a pre-elution step with 170 mM NaCl to remove bound host cell proteins (HCP). F4 was then eluted with 460 mM NaCl, 8M urea, 10 mM DTT in phosphate buffer at pH 7.0.
-   The SO3 eluate was diluted 2 fold with 10 mM phosphate buffer, pH 7, and loaded onto an Octyl sepharose column (Amersham Biosciences) in the presence of 4M urea, 1 mM DTT, 230 mM NaCl in phosphate buffer at pH 7.0. Following a washing step (equilibration buffer), bound F4 was eluted with 8M urea, 1 mM DTT in 25 mM Tris buffer at pH 8.0.
-   The Octyl eluate was diluted and adjusted to pH 9.0, and F4 was then bound to a Q sepharose column (Amersham Biosciences) in the presence of 8M urea at pH 9.0 (25 mM Tris). Unbound protein was washed off (8M urea, 25 mM Tris at pH 9.0) and a pre-elution step (90 mM NaCl in 8M urea, 25 mM Tris, pH 9.0) removed HCP and F4-degradation products. F4 was desorbed from the column with 200 mM NaCl, 8M urea in Tris buffer at pH 9.0.
-   An aliquot of the Q eluate was spiked with 1% SDS and dialyzed against PBS buffer containing 0.1% SDS and 1 mM DTT to remove the urea prior to injecting the sample onto the gel filtration column (prep grade Superdex 200, two 16×60 cm columns connected in a row). The relevant fractions were pooled after in-process SDS-PAGE analysis.
-   Samples were dialyzed twice at RT in dialysis membranes (12-14 kDa cut-off) overnight against 1 l of 0.5M Arginine, 10 mM Tris, 5 mM Glutathione, pH 8.5.
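The NaCl and urea concentrations quoted for the Octyl sepharose load follow from the 2-fold dilution of the SO3 eluate. A minimal sketch of that arithmetic, assuming the diluent (10 mM phosphate buffer, pH 7) contributes no NaCl or urea; the function and dictionary key names are illustrative:

```python
def dilute(concentration, fold):
    """Concentration after an n-fold dilution with a diluent lacking the solute."""
    if fold < 1:
        raise ValueError("fold dilution must be >= 1")
    return concentration / fold


# SO3 elution conditions taken from the text.
so3_eluate = {"NaCl_mM": 460.0, "urea_M": 8.0}

# 2-fold dilution before loading onto the Octyl sepharose column.
octyl_load = {name: dilute(value, 2) for name, value in so3_eluate.items()}

print(octyl_load)
```

The result reproduces the 230 mM NaCl and 4M urea stated for the loading conditions.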

The sequential purification steps are shown in the flowchart below.

Results Purification of F4

SDS-PAGE/Western Blot Follow-Up of the Purification Process

FIG. 13 shows the SDS gel and the anti-p24/anti-Nef-Tat western blot of the F4-containing fractions collected during the purification of F4.

The E. coli homogenate is shown in FIG. 13, lane 2, with F4 estimated to represent about 10% of the total proteins (density scans of Coomassie blue stained SDS-gels). After centrifugation, the soluble fraction of F4 was recovered in the clarified supernatant (lane 3). The ammonium sulfate precipitation step eliminated many impurities (lane 4) and reduced the protein load for the subsequent chromatographic step. Additionally, the 8M urea used to resuspend the precipitate dissociated complexes of F4 with HCP and allowed both complete capture of F4 by, and quantitative elution from, the SO3 resin. The SO3 eluate shown in lane 5 was considerably enriched in F4, but the heterogeneous pattern remained essentially unchanged. The hydrophobic Octyl sepharose column mainly removed low molecular weight (LMW) HCP and F4-degradation products (lane 6), thereby simplifying the F4 pattern. The Q sepharose chromatography further simplified the F4 pattern and removed many impurities (lane 7). Final purity in terms of E. coli impurities was obtained after this step: no host cell proteins were detected in the Q eluate by anti-E. coli western blot analysis. The purified F4 thus produced is referred to as F4Q. The Superdex 200 column separated LMW F4-degradation products from the full length F4, improving F4 homogeneity in the Superdex 200 eluate (lane 8). The term F4S may be used to refer to F4 purified according to the full scheme of method I.

An anti-E. coli western blot was performed on the same fractions collected during the purification of F4. The absence of visible bands on the anti-E. coli western blot indicated HCP contamination below 1% in the Q eluate and in the Superdex eluate.

F4 and Protein Recovery

F4 recovery at each step of the purification process was estimated from SDS-PAGE and western blot analysis. To estimate F4 recovery from SDS-gels, the sample volumes loaded onto the SDS-gels corresponded to the volumes of the different fractions collected during the purification.

Table 1 displays the protein recovery in the F4-containing fractions.

TABLE 1 Protein recovery in the F4-positive fractions collected during the purification process (360 ml homogenate). The protein concentration was determined with the Lowry assay.

Purification Step            Protein (mg)   Step Recovery (%)   Cum. Recovery (%)
homogenate                   6500           100                 100
clarified homogenate         4641           71                  71
resuspended AS precipitate   728            16                  11
SO3 eluate                   247            34                  3.8
Octyl sepharose eluate       129            52                  2.0
Q sepharose eluate           74             57                  1.1
Superdex 200                 36             49                  0.6

The table shows the amount of protein in the homogenate and the soluble material, including F4, recovered in the supernatant after the clarification step. The AS-precipitation step removed a large amount of HCP, and only a slight loss of F4 was observed on the SDS-gel. The SO3 chromatography additionally removed many impurities, and the SDS-gel indicated a high recovery of F4. In contrast, the ~50% protein recovery measured with both the Octyl sepharose and the Q sepharose columns was also accompanied by losses of F4. Protein recovery after the gel filtration chromatography was about 50%. The SDS-gel shows that many LMW-protein bands (F4-degradation bands) were removed, concomitantly reducing F4 recovery.
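The step and cumulative recoveries in Table 1 can be recomputed from the protein amounts alone. The following sketch is illustrative; the variable and function names are not from the source, and only the milligram values are taken from the table.

```python
# Protein amounts (mg) per purification step, from Table 1.
PROTEIN_MG = [
    ("homogenate", 6500),
    ("clarified homogenate", 4641),
    ("resuspended AS precipitate", 728),
    ("SO3 eluate", 247),
    ("Octyl sepharose eluate", 129),
    ("Q sepharose eluate", 74),
    ("Superdex 200", 36),
]


def recoveries(steps):
    """Return (name, step %, cumulative %) for each purification step.

    Step recovery is relative to the previous step, cumulative recovery
    is relative to the first step (the homogenate).
    """
    start = steps[0][1]
    previous = start
    rows = []
    for name, mg in steps:
        rows.append((name, 100.0 * mg / previous, 100.0 * mg / start))
        previous = mg
    return rows


rows = recoveries(PROTEIN_MG)
for name, step_pct, cum_pct in rows:
    print(f"{name:28s} step {step_pct:5.1f}%   cumulative {cum_pct:5.1f}%")
```

Rounded to the precision used in the table, the computed percentages agree with the reported values.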

F4 Yield

Table 1 above shows that about 36 mg purified F4 could be obtained from 360 ml homogenate at OD50. Therefore, 1 l homogenate at OD50 should yield about 100 mg purified F4. Since ODs of 70-90 were achieved during the fermentation process, the yield per liter of fermenter culture would accordingly be in the range of 140 to 180 mg F4.

Results Purification of F4(p51)*

The F4(p51)* fusion construct was purified using purification method I described above without modifications.

SDS-PAGE/Western Blot Follow-Up of the Purification Process

FIG. 14 shows the SDS gel and the anti-p24/anti-Nef-Tat western blot of the F4(p51)*-containing fractions collected during the purification of F4(p51)*.

The SDS-gel and the western blot demonstrate that the F4(p51)* fusion protein globally behaved similarly to F4 at the ammonium sulfate precipitation step as well as during the chromatographic steps. Purified F4(p51)* had a heterogeneity pattern similar to purified F4.

An anti-E. coli western blot indicated that HCP contamination was below 1% in both the Q eluate and the Superdex eluate.

Yield

About 25% of F4(p51)* was lost in the insoluble fraction of the homogenate. Additionally, because the purification method was not adapted to this protein, losses were observed at the chromatographic steps. Therefore the overall recovery of F4(p51)* was reduced to about 25 mg per liter homogenate (OD50). Extrapolated to 1 liter culture at OD 177, the yield would accordingly be in the range of 85 mg F4(p51)*.

Results Purification of F4*

The F4* fusion construct was purified using purification method I described above without modifications.

SDS-PAGE/Western Blot Follow-Up of the Purification Process

FIG. 15 shows the SDS gel and the anti-p24/anti-Nef-Tat western blot of the F4*-containing fractions collected during the purification of F4*.

As with F4(p51)*, F4* globally behaved quite similarly to F4 during the purification procedure. The protein was recovered in the expected fractions as shown by the SDS-gel and the western blot. An anti-E. coli western blot also demonstrated elimination of most HCP already after the Q sepharose column.

Yield

The global recovery was about 17 mg purified F4* obtained from 465 ml homogenate at OD50. Extrapolated to 1 l culture at OD 140, the yield would accordingly be in the range of 100 mg F4*.
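The yield figures quoted for F4, F4(p51)* and F4* are linear extrapolations of the recovered amount to 1 liter and to a higher optical density. A minimal sketch of that scaling, assuming yield is proportional to both volume and OD (an assumption implied by the extrapolations, not a measured relationship); the function name is illustrative:

```python
def extrapolated_yield_mg_per_l(recovered_mg, homogenate_ml, od_measured, od_target):
    """Scale a recovered amount to 1 liter at a target OD, assuming linearity."""
    per_litre_at_measured_od = recovered_mg * 1000.0 / homogenate_ml
    return per_litre_at_measured_od * od_target / od_measured


# F4: 36 mg from 360 ml at OD50, fermentation ODs of 70-90.
f4_low = extrapolated_yield_mg_per_l(36, 360, 50, 70)
f4_high = extrapolated_yield_mg_per_l(36, 360, 50, 90)

# F4(p51)*: about 25 mg per liter at OD50, culture at OD 177.
f4p51_star = extrapolated_yield_mg_per_l(25, 1000, 50, 177)

# F4*: 17 mg from 465 ml at OD50, culture at OD 140.
f4_star = extrapolated_yield_mg_per_l(17, 465, 50, 140)

print(f"F4:       {f4_low:.0f}-{f4_high:.0f} mg/l")
print(f"F4(p51)*: {f4p51_star:.1f} mg/l")
print(f"F4*:      {f4_star:.1f} mg/l")
```

The computed values (about 140 to 180, 88.5 and 102 mg per liter) are consistent with the approximate ranges given in the text.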

In summary, the three fusion proteins F4, F4(p51)* and F4* were purified employing purification method I. The SDS gel in FIG. 16 compares the three purified proteins, showing the different level of heterogeneity of the constructs after the Q sepharose step and after elimination of LMW bands by the Superdex 200 column.

Example 8 Purification of F4 and F4co (Codon Optimized): Purification Method II

Purification Method II

A simplified purification procedure, method II, was also developed. Method II consists of only 2 chromatographic steps and a final dialysis/diafiltration for buffer exchange. Notably, a CM hyperZ chromatographic column (BioSepra) was introduced to replace the clarification step, the ammonium sulfate precipitation and the SO3 chromatography of method I (Example 7). Method II was used to purify both F4 and full-codon optimized F4 (“F4co”). For F4co, two different forms of method II were performed, one involving carboxyamidation and one not. The purpose of the carboxyamidation step was to prevent oxidative aggregation of the protein. This carboxyamidation is performed after the first chromatographic step (CM hyperZ).

-   E. coli cells (expressing F4 or F4co) were homogenized in 50 mM Tris buffer at pH 8.0 in the presence of 10 mM DTT, at OD90. 2 Rannie passages were applied at 1000 bars.
-   8M urea was added to the homogenate before application to the CM hyperZ resin (BioSepra) equilibrated with 8M urea in phosphate buffer at pH 7. Antigen capture was done in a batch mode. The resin was then packed in a column, unbound proteins were washed off with the equilibration buffer and bound host cell proteins (HCP) were removed by a pre-elution step with 120 mM NaCl. F4co was then eluted with 360 mM NaCl, 8M urea, 10 mM DTT in phosphate buffer at pH 7.0.
-   To control oxidative aggregation of the fusion protein, the cysteine groups of F4co can be carboxyamidated with iodoacetamide. Therefore, optionally, 50 mM iodoacetamide was added to the CM hyperZ eluate and carboxyamidation was done for 30 min at room temperature in the dark.
-   The CM hyperZ eluate was then adequately diluted (about 5-8 fold) and adjusted to pH 9.0. F4co or F4coca was then bound to a Q sepharose column (Amersham Biosciences) in the presence of 8M urea in Tris buffer at pH 9.0. Unbound protein was washed off with the equilibration buffer, and a pre-elution step with 90 mM NaCl (only with non-carboxyamidated protein) in the same buffer removed bound HCP. F4co was desorbed from the column with 200 mM NaCl, 8M urea in Tris buffer at pH 9.0.
-   Samples were dialyzed twice at RT in dialysis membranes (12-14 kDa cut-off) overnight against 1 l of 0.5M Arginine, 10 mM Tris buffer, 10 mM Glutathione (only added to the non-carboxyamidated protein), pH 8.5. Alternatively, buffer exchange was accomplished by diafiltration against 10 sample volumes of the same buffer using a tangential-flow membrane with 30 or 50 kDa cut-off.
-   Finally, the dialyzed product was sterile filtered through a 0.22 μm membrane.

The sequential purification steps are shown in the flowchart below.

All buffers contained DTT if F4co was not carboxyamidated, and Glutathione in the purified bulk. Reducing agents were omitted once the protein was carboxyamidated. *NaCl: for F4co this was 200 mM NaCl; for F4coca elution was by gradient of NaCl. This step can be further optimized for F4coca by pre-eluting with 60 mM NaCl and eluting with 100 mM NaCl, and for F4co by eluting with 100 mM NaCl (no pre-elution step needed).

Results: Purification of F4co

FIG. 17 shows an SDS gel of the F4-containing fractions collected during the purification of F4co and the purification of carboxyamidated F4co (“F4coca”).

The CM hyperZ resin completely captured F4co from the crude homogenate (lane 1) in the presence of 8M urea, and quantitative elution was achieved with 360 mM NaCl. The CM hyperZ eluate shown in lane 2 was considerably enriched in F4co. After appropriate dilution and adjustment of the sample to pH 9, F4co or F4coca was bound to a Q sepharose column. F4co or F4coca was then specifically eluted with 200 mM NaCl as shown in lane 3. This chromatography removed not only remaining host cell proteins but also DNA and endotoxins. To bring the purified material into a formulation-compatible buffer, the Q sepharose eluate was dialyzed against 10 mM Tris buffer, 0.5M Arginine, 10 mM Glutathione pH 8.5 in a dialysis membrane with 12-14 kDa cut-off. Glutathione was omitted with the carboxyamidated protein.

Purification of both F4co and F4coca yielded about 500 mg purified material per L of culture OD130. This was in a similar range as observed before with the non-codon-optimized F4.

As described above, two different purification methods (I and II) have been developed to purify the different F4 constructs. FIG. 18 compares the different purified bulks that were obtained.

The SDS gel in FIG. 18 clearly illustrates the distinct pattern of the two different proteins, F4 and F4co. Whereas F4 presented several strong low molecular weight (LMW) bands, only faint bands were visible with the codon-optimized F4co. Method I and method II produce a very similar F4co pattern. Anti-E. coli western blot analysis confirmed the purity of the purified proteins, indicating host cell protein contamination below 1% in all the preparations.

Example 9 Immunogenicity of F4 in Mice

Formulation:

Adjuvant Formulation 1B:

To prepare adjuvant formulation 1B, a mixture of lipid (such as phosphatidylcholine either from egg-yolk or synthetic), cholesterol and 3D-MPL in organic solvent is dried down under vacuum (or alternatively under a stream of inert gas). An aqueous solution (such as phosphate buffered saline) is then added, and the vessel agitated until all the lipid is in suspension. This suspension is then microfluidised until the liposome size is reduced to about 100 nm, and then sterile filtered through a 0.2 μm filter. Extrusion or sonication could replace this step.

Typically the cholesterol:phosphatidylcholine ratio is 1:4 (w/w), and the aqueous solution is added to give a final cholesterol concentration of 5 to 50 mg/ml.

The liposomes have a defined size of 100 nm and are referred to as SUV (for small unilamellar vesicles). If this solution is repeatedly frozen and thawed, the vesicles fuse to form large multilamellar structures (MLV) of size ranging from 500 nm to 15 μm.

The liposomes by themselves are stable over time and have no fusogenic capacity. QS21 in aqueous solution is added to the liposomes to reach final 3D-MPL and QS21 concentrations of 100 μg/ml.

Formulation 2A: 3 De-acylated monophosphoryl lipid A and QS21 in an oil in water emulsion;

The oil in water emulsion can be prepared by following the protocol set forth in WO 95/17210. In detail, the emulsion contains: 5% squalene, 5% tocopherol, 2.0% Tween 80; the particle size is 180 nm.

Preparation of Oil in water emulsion (2 fold concentrate)

Tween 80 is dissolved in phosphate buffered saline (PBS) to give a 2% solution in the PBS. To provide 100 ml of two fold concentrate emulsion, 5 g of DL alpha tocopherol and 5 ml of squalene are vortexed to mix thoroughly. 90 ml of PBS/Tween solution is added and mixed thoroughly. The resulting emulsion is then passed through a syringe and finally microfluidised by using an M110S microfluidics machine. The resulting oil droplets have a size of approximately 180 nm.

Sterile bulk emulsion is added to PBS to reach a final concentration of 500 μl of emulsion per ml (v/v). 3D-MPL is then added to reach a final concentration of 100 μg per ml. QS21 is then added to reach a final concentration of 100 μg per ml. Between each addition of component, the intermediate product is stirred for 5 minutes.

F4Q (not codon optimized), purified according to purification method I, was diluted in a phosphate/Arginine buffer pH 6.8. The dilution was mixed with two different concentrated adjuvants (adjuvants 2A and 1B) to obtain a final formulation containing, per 500 μl dose, 40 μg of F4 in the presence of 290 mM (adjuvant 2A) or 300 mM (adjuvant 1B) Arginine, 50 μg MPL and 50 μg QS21. 100 μl of each formulation was injected in mice.

Mouse immunogenicity studies were performed to evaluate the cellular and humoral immune responses to the four antigens found within F4 (p24, p17, RT and Nef).

Due to the complexity of the F4 antigen, eight strains of mice, each with a different genetic background, were immunised twice, at day 0 and day 21, with 8 μg of adjuvanted F4 protein prepared as described above, in a 100 μl volume. Serum and spleen samples were collected 14 days following the last immunisation (day 35) for analysis of the humoral and cellular responses to each of the four components of F4 (p24, p17, RT and Nef), as well as F4.
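The amounts delivered per mouse follow from scaling the 500 μl dose composition to the 100 μl injection volume. The sketch below is illustrative only; the dictionary keys and function name are not from the source, and it assumes the formulation is homogeneous so that each component scales with volume.

```python
# Composition of one 500 ul dose, as given for the final formulation.
DOSE_VOLUME_UL = 500.0
DOSE_CONTENT_UG = {"F4": 40.0, "MPL": 50.0, "QS21": 50.0}


def amount_injected(dose_content_ug, dose_volume_ul, injected_volume_ul):
    """Scale each component of the dose to the injected volume."""
    fraction = injected_volume_ul / dose_volume_ul
    return {name: ug * fraction for name, ug in dose_content_ug.items()}


per_mouse = amount_injected(DOSE_CONTENT_UG, DOSE_VOLUME_UL, injected_volume_ul=100.0)

# Concentration of each adjuvant component in the final formulation (ug/ml).
mpl_ug_per_ml = DOSE_CONTENT_UG["MPL"] / DOSE_VOLUME_UL * 1000.0

print(per_mouse)
print(f"MPL concentration: {mpl_ug_per_ml:.0f} ug/ml")
```

Under this assumption each 100 μl injection carries 8 μg of F4 together with 10 μg each of MPL and QS21.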

Total antibody responses were characterised by ELISAs specific for p24, p17, RT, Nef and F4. The following table, Table 2, summarises where antigen specific humoral responses were observed in each strain. The results indicate the presence or absence of antibodies compared to control animals immunized with adjuvant alone. The results presented are a compilation from two separate but identical experiments. In the table, 2A refers to antigen formulated with 3D-MPL and QS21 in an oil in water emulsion, and 1B refers to antigen formulated with 3D-MPL, QS21 and cholesterol containing liposomes.

TABLE 2

mouse strain   p17              p24   Nef   RT   F4
CB6F1          +/− (+2A −1B)    +     +     +    +
Balb/c         − (+/−2A −1B)    +     +     +    +
C3H            −                −     −     −    −
DBA            −                +     +     +    +
CBA            −                −     +/−   +    +
129Sv          −                +     +     +    +
B6D2F1         +/− (+2A −1B)    +     +     +    +
OF1            +                +     +     +    +

Adjuvant-specific entries given for the CBA row: +2A −1B; +2A +/−1B
+ = presence of antibodies
− = absence of antibodies

OF1 mice mounted antibody responses to all four F4 components. The responses observed are shown in FIG. 19. +/− indicates that the response observed was weak or only observed with one of the two adjuvants. For example, B6D2F1 mice p17 responses: +/− overall with a +2A and −1B means that there was a response with 2A (not weak) and none with 1B. Balb/c mice p17 responses: − overall, with a +/−2A and a −1B; here the +/− means that the response with adjuvant 2A was weak.

Cellular responses were characterised by flow cytometry staining for CD4 and CD8, and for IFN and IL-2 expression (intracellular cytokine staining), following restimulation of spleen cells with p24, p17, RT or Nef specific peptides, using peptide library pools of 15 mers with 11 mer overlap. CD4 responses were the dominant cellular response observed. The following table, Table 3, summarises where antigen specific CD4+IL-2+ responses were observed for each mouse strain. Again, this is shown as presence or absence of a response.

TABLE 3

mouse strain   p17   p24        Nef        RT
CB6F1          −     +          +          +
Balb/c         −     +/− weak   +/− weak   +
C3H            +     +          −          +
DBA            +     +          +          +
CBA            +     +          −          +
129Sv          +     +          −          +
B6D2F1         −     +          +          +
OF1            −     −          −          +

Adjuvant-specific entry given for the CB6F1 row: +/−2A −1B
Adjuvant-specific entry given for the CBA row: −2A +1B
+ = presence of CD4+IL-2+
− = absence of CD4+IL-2+

DBA mice mounted CD4 responses to all four F4 components. The CD4+IL-2+ and CD4+IFN+ responses observed for this mouse strain are shown in FIG. 20.
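The peptide pools used for restimulation (15 mers with 11 mer overlap) correspond to a sliding window that advances 4 residues at a time. The sketch below illustrates how such a library can be generated; the input is the first 27 residues of the p17 portion of the sequence listing, used purely as an example. The actual peptides used are not listed in the text, and the handling of a C-terminal remainder is an assumption.

```python
def overlapping_peptides(sequence, length=15, overlap=11):
    """Split a protein sequence into fixed-length peptides sharing `overlap` residues."""
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than peptide length")
    if len(sequence) <= length:
        return [sequence]
    peptides = [
        sequence[start:start + length]
        for start in range(0, len(sequence) - length + 1, step)
    ]
    # Assumption: if the window does not land exactly on the C-terminus,
    # add one final peptide covering the last `length` residues.
    if (len(sequence) - length) % step != 0:
        peptides.append(sequence[-length:])
    return peptides


# First 27 residues of the p17 region, taken from the sequence listing as an example.
P17_START = "MGARASVLSGGELDRWEKIRLRPGGKK"

peptides = overlapping_peptides(P17_START)
for index, peptide in enumerate(peptides, start=1):
    print(index, peptide)
```

With a 15-residue window and a step of 4, a 27-residue stretch yields four peptides, each sharing its last 11 residues with the start of the next.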

In summary, F4 formulated in either of the two adjuvant formulations is able to promote humoral and cellular responses to p24, p17, RT and Nef. This shows that each region of F4 is immunogenic in an in vivo situation.

1. A polynucleotide encoding a fusion polypeptide comprising a Nef polypeptide, an RT polypeptide, a p17 Gag polypeptide, and a p24 Gag polypeptide, wherein there is at least one HIV antigen between the p17 Gag polypeptide and the p24 Gag polypeptide.
2. The polynucleotide of claim 1, wherein the RT polypeptide is p66.
3. The polynucleotide of claim 1, wherein the RT polypeptide is truncated at the C terminus such that it lacks the carboxy terminal RNase H domain.
4. The polynucleotide of claim 1, wherein the RT polypeptide is p51.
5. The polynucleotide of claim 1, wherein the fusion protein comprises from N-terminal to C-terminal: p24-RT-Nef-p17.
6. The polynucleotide of claim 5, wherein the fusion protein comprises from N-terminal to C-terminal: p24-p51RT-Nef-p17.
7. The polynucleotide of claim 1, wherein the fusion protein comprises SEQ ID NO:2.
8. The polynucleotide of claim 7, comprising SEQ ID NO:1.
9. The polynucleotide of claim 1, wherein the fusion protein comprises SEQ ID NO:15.
10. The polynucleotide of claim 9, comprising SEQ ID NO:14.
11. The polynucleotide of claim 1, wherein the fusion protein comprises SEQ ID NO:19.
12. The polynucleotide of claim 11, comprising SEQ ID NO:18.
13. The polynucleotide of claim 5, wherein the amino acid at the position corresponding to position 592 in SEQ ID NO:2 is not methionine.
14. The polynucleotide of claim 13, wherein said amino acid is lysine.
15. A pharmaceutical composition comprising the polynucleotide of claim 1 and a pharmaceutically acceptable carrier.
16. The pharmaceutical composition of claim 15, further comprising an adjuvant.
17. The pharmaceutical composition of claim 16, wherein the adjuvant comprises a Th1 inducing adjuvant.
18. The pharmaceutical composition of claim 17, wherein the Th1 inducing adjuvant comprises QS21, 3D-MPL, or a combination of QS21 and 3D-MPL.